Re: [DISCUSS] DockerHub repository maintainers

2022-05-06 Thread Till Rohrmann
Hi everyone,

thanks for starting this discussion Xintong. I would volunteer as a
maintainer of the flink-statefun Docker repository if you need one.

Cheers,
Till

On Fri, May 6, 2022 at 6:22 AM Xintong Song  wrote:

> It seems to me we at least don't have a consensus on dropping the use of
> apache namespace, which means we need to decide on a list of maintainers
> anyway. So maybe we can get the discussion back to the maintainers. We may
> continue the official-image vs. apache-namespace discussion in a separate
> thread if necessary.
>
> As mentioned previously, we need to reduce the number of maintainers from
> 20 to 5, as required by INFRA. Jingsong and I would like to volunteer as 2
> of the 5, and we would like to learn who else wants to join us. Of course
> the list of maintainers can be modified later.
>
> *This also means the current maintainers may be removed from the list.*
> Please let us know if you still need that privilege. CC-ed all the current
> maintainers for attention.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Wed, May 4, 2022 at 3:14 PM Chesnay Schepler 
> wrote:
>
> > One advantage is that the images are periodically rebuilt to get
> > security fixes.
> >
> > The operator is a different story anyway because it is AFAIK only
> > supposed to be used via docker
> > (i.e., no standalone mode), which alleviates concerns about keeping the
> > logic within the image
> > to a minimum (which bit us in the past on the flink side).
> >
> > On 03/05/2022 16:09, Yang Wang wrote:
> > > The flink-kubernetes-operator project is only published
> > > via apache/flink-kubernetes-operator on docker hub and github packages.
> > > We do not find obvious advantages in using Docker Hub official images.
> > >
> > > Best,
> > > Yang
> > >
> > > Xintong Song wrote on Thu, Apr 28, 2022 at 19:27:
> > >
> > >> I agree with you that doing QA for the image after the release has been
> > >> finalized doesn't feel right. IIUR, that is mostly because official
> > >> image PR needs 1) the binary release being deployed and propagated and
> > >> 2) the corresponding git commit being specified. I'm not completely sure
> > >> about this. Maybe we can improve the process by investigating more about
> > >> the feasibility of pre-verifying an official image PR before finalizing
> > >> the release. It's definitely a good thing to do if possible.
> > >>
> > >> I also agree that QA from DockerHub folks is valuable to us.
> > >>
> > >> I'm not against publishing official-images, and I'm not against working
> > >> closely with the DockerHub folks to improve the process of delivering
> > >> the official image. However, I don't think these should become reasons
> > >> that we don't release our own apache/flink images.
> > >>
> > >> Taking 1.12.0 as an example, admittedly it would be nice for us to
> > >> comply with the DockerHub folks' standards and not have a
> > >> just-for-kubernetes command in our entrypoint. However, this is IMO far
> > >> less important than delivering the image to our users in a timely
> > >> manner. I guess that's where the DockerHub folks and we have different
> > >> priorities, and that's why I think we should have a path that is fully
> > >> controlled by this community to deliver images. We could take their
> > >> valuable inputs and improve afterwards. Actually, that's what we did for
> > >> 1.12.0 by starting to release to apache/flink.
> > >>
> > >> Thank you~
> > >>
> > >> Xintong Song
> > >>
> > >>
> > >>
> > >> On Thu, Apr 28, 2022 at 6:30 PM Chesnay Schepler 
> > >> wrote:
> > >>
> > >>> I still think that's mostly a process issue.
> > >>> Of course we can be blind-sided if we do the QA for a release artifact
> > >>> after the release has been finalized.
> > >>> But that's a clearly broken process from the get-go.
> > >>>
> > >>> At the very least we should already open a PR when the RC is created to
> > >>> get earlier feedback.
> > >>>
> > >>> Moreover, nowadays the docker images are way slimmer and we are much
> > >>> more careful on what is actually added to the scripts.
> > >>>
> > >>> Finally, the problems they found did show that their QA is very valuable
> > >>> to us. And side-stepping that for such an essential piece of a release
> > >>> isn't a good idea imo.
> > >>>
> > >>> On 28/04/2022 11:31, Xintong Song wrote:
> > >>>> I'm overall against only releasing to official-images.
> > >>>>
> > >>>> We started releasing to apache/flink, in addition to the official-image,
> > >>>> in 1.12.0. That was because releasing the official-image needs approval
> > >>>> from the DockerHub folks, which is not under control of the Flink
> > >>>> community. For 1.12.0 there were unfortunately some divergences between
> > >>>> us and the DockerHub folks, and it ended-up taking us nearly 2 months
> > >>>> to get that official-image PR merged [1][2]. Many users, especially
> > >>>> those who need Flink's K8s & Native-K8s deployment modes, were as

Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-06 Thread rui fan
Hi Anton, Piotrek and Dawid,

Thanks for your help.

I created FLINK-27522[1] as the first task. And I will finish it asap.

@Piotrek, for the default value, do you think it should be less
than 5? What do you think about 3? Actually, I don't think 5 is big.
Whether it's 1, 3, or 5 doesn't matter much; the focus is on
reasonably resolving the deadlock problem. Or I can push the second
task forward first and we can discuss the default value in the PR.

For the legacySource, I got your idea, and I propose we create
a third task to handle it, because it is independent and exists for
compatibility with the old API. What do you think? I updated
the third task in FLIP-227[2].

If all is OK, I will create a JIRA for the third task and add it to
FLIP-227, and I will develop them in order from the first task to the
third.

Thanks again for your help.

[1] https://issues.apache.org/jira/browse/FLINK-27522
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-227%3A+Support+overdraft+buffer

Thanks
fanrui

On Fri, May 6, 2022 at 3:50 AM Piotr Nowojski  wrote:

> Hi fanrui,
>
> > How to identify legacySource?
>
> legacy sources are always using the SourceStreamTask class and
> SourceStreamTask is used only for legacy sources. But I'm not sure how to
> enable/disable that. Adding `disableOverdraft()` call in SourceStreamTask
> would be better compared to relying on the `getAvailableFuture()` call
> (isn't it used for back pressure metric anyway?). Ideally we should
> enable/disable it in the constructors, but that might be tricky.
>
> > I prefer it to be between 5 and 10
>
> I would vote for a smaller value because of FLINK-13203
>
> Piotrek
>
>
>
> On Thu, May 5, 2022 at 11:49, rui fan <1996fan...@gmail.com> wrote:
>
>> Hi,
>>
>> Thanks a lot for your discussion.
>>
>> After several discussions, I think it's clear now. I updated the
>> "Proposed Changes" of FLIP-227[1]. If I have something
>> missing, please help to add it to FLIP, or add it in the mail
>> and I can add it to FLIP. If everything is OK, I will create a
>> new JIRA for the first task, and use FLINK-26762[2] as the
>> second task.
>>
>> About the legacy source, do we set maxOverdraftBuffersPerGate=0
>> directly? How to identify legacySource? Or could we add
>> the overdraftEnabled in LocalBufferPool? The default value
>> is false. If the getAvailableFuture is called, change
>> overdraftEnabled=true.
>> It indicates whether there are checks isAvailable elsewhere.
>> It might be more general, it can cover more cases.
>>
>> Also, I think the default value of 'max-overdraft-buffers-per-gate'
>> needs to be confirmed. I prefer it to be between 5 and 10. How
>> do you think?
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-227%3A+Support+overdraft+buffer
>> [2] https://issues.apache.org/jira/browse/FLINK-26762
>>
>> Thanks
>> fanrui
>>
>> On Thu, May 5, 2022 at 4:41 PM Piotr Nowojski 
>> wrote:
>>
>>> Hi again,
>>>
>>> After sleeping over this, if both versions (reserve and overdraft) have
>>> the same complexity, I would also prefer the overdraft.
>>>
>>> > `Integer.MAX_VALUE` as default value was my idea as well but now, as
>>> > Dawid mentioned, I think it is dangerous since it is too implicit for
>>> > the user and if the user submits one more job for the same TaskManger
>>>
>>> As I mentioned, it's not only an issue with multiple jobs. The same
>>> problem can happen with different subtasks from the same job, potentially
>>> leading to the FLINK-13203 deadlock [1]. With FLINK-13203 fixed, I would be
>>> in favour of Integer.MAX_VALUE to be the default value, but as it is, I
>>> think we should indeed play on the safe side and limit it.
>>>
>>> > I still don't understand how should be limited "reserve"
>>> implementation.
>>> > I mean if we have X buffers in total and the user sets overdraft equal
>>> > to X we obviously can not reserve all buffers, but how many we are
>>> > allowed to reserve? Should it be a different configuration like
>>> > percentegeForReservedBuffers?
>>>
>>> The reserve could be defined as percentage, or as a fixed number of
>>> buffers. But yes. In normal operation subtask would not use the reserve, as
>>> if numberOfAvailableBuffers < reserve, the output would be not available.
>>> Only in the flatMap/timers/huge records case the reserve could be used.
>>>
>>> > 1. If the total buffers of LocalBufferPool <= the reserve buffers,
>>> will LocalBufferPool never be available? Can't process data?
>>>
>>> Of course we would need to make sure that never happens. So the reserve
>>> should be < total buffer size.
>>>
> >>> > 2. If the overdraft buffer use the extra buffers, when the downstream
> >>> > task inputBuffer is insufficient, it should fail to start the job, and
> >>> > then restart? When the InputBuffer is initialized, it will apply for
> >>> > enough buffers, right?
> >>>
> >>> The failover if downstream can not allocate buffers is already
> >>> implemented FLINK-14872 [2]. There is a timeout for how long the task is
> >>> waiting for buff
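To make the overdraft semantics debated in this thread concrete, here is a toy model of a buffer pool with a per-gate overdraft limit. This is only an illustrative sketch of the idea under the thread's assumptions; the class and method names below are invented for the example and are not Flink's actual `LocalBufferPool` API:

```java
/** Toy buffer pool illustrating the FLIP-227 "overdraft" idea (names are hypothetical). */
class ToyBufferPool {
    private final int totalBuffers; // buffers normally owned by this pool
    private final int maxOverdraft; // extra buffers allowed mid-record (flatMap/timers/huge records)
    private int requested;          // buffers currently handed out

    ToyBufferPool(int totalBuffers, int maxOverdraft) {
        this.totalBuffers = totalBuffers;
        this.maxOverdraft = maxOverdraft;
    }

    /** Availability check done *between* records: ignores the overdraft. */
    boolean isAvailable() {
        return requested < totalBuffers;
    }

    /** Request done *inside* record processing: may dip into the overdraft instead of blocking. */
    boolean requestBuffer() {
        if (requested < totalBuffers + maxOverdraft) {
            requested++;
            return true;
        }
        return false; // a real pool would have to block here
    }

    void recycle() {
        requested--;
    }
}
```

With `totalBuffers = 2` and `maxOverdraft = 3`, a flatMap emitting several records can still acquire up to 5 buffers mid-record, while the task only reports itself available for a *new* record once it is back under the normal limit. This also shows why the default matters: every subtask on a TaskManager may dip into the same shared network memory.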

[jira] [Created] (FLINK-27526) Support scaling bucket number for FileStore

2022-05-06 Thread Jane Chan (Jira)
Jane Chan created FLINK-27526:
-

 Summary: Support scaling bucket number for FileStore
 Key: FLINK-27526
 URL: https://issues.apache.org/jira/browse/FLINK-27526
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Affects Versions: table-store-0.2.0
Reporter: Jane Chan
 Fix For: table-store-0.2.0






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27527) Create a file-based Upsert sink for testing internal components

2022-05-06 Thread Alexander Preuss (Jira)
Alexander Preuss created FLINK-27527:


 Summary: Create a file-based Upsert sink for testing internal 
components
 Key: FLINK-27527
 URL: https://issues.apache.org/jira/browse/FLINK-27527
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / Common
Affects Versions: 1.16.0
Reporter: Alexander Preuss


There are a bunch of tests that, in order to ensure the correctness of their tested 
component, rely on a sink providing upserts. These tests (e.g. 
test-sql-client.sh) mostly use the ElasticsearchSink, which brings a lot of 
overhead. We want to provide a simple file-based upsert sink for Flink 
developers to test their components against. The sink should be very simple and 
is not supposed to be used in production scenarios, but rather just to 
facilitate easier testing.





Facing Difficulty in Deserializing the Avro messages

2022-05-06 Thread Vishist Bhoopalam
Hi,

I have explained the problem in depth on Stack Overflow and am hoping for a quick 
response since I have a deadline to catch.

https://stackoverflow.com/questions/72137036/how-to-deseralize-avro-response-getting-from-datastream-scala-apache-flink

Thanks and Regards
BR Vishist

[jira] [Created] (FLINK-27528) Introduce a new configuration option key 'compact-rescale'

2022-05-06 Thread Jane Chan (Jira)
Jane Chan created FLINK-27528:
-

 Summary: Introduce a new configuration option key 'compact-rescale'
 Key: FLINK-27528
 URL: https://issues.apache.org/jira/browse/FLINK-27528
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Affects Versions: table-store-0.2.0
Reporter: Jane Chan
 Fix For: table-store-0.2.0


This option will be added to FileStoreOptions to control compaction behavior





[jira] [Created] (FLINK-27529) HybridSourceSplitEnumerator sourceIndex use error Integer check

2022-05-06 Thread Ran Tao (Jira)
Ran Tao created FLINK-27529:
---

 Summary: HybridSourceSplitEnumerator sourceIndex use error Integer 
check
 Key: FLINK-27529
 URL: https://issues.apache.org/jira/browse/FLINK-27529
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Common
Affects Versions: 1.14.4, 1.15.0, 1.15.1
Reporter: Ran Tao


Currently HybridSourceSplitEnumerator checks readerSourceIndex using the Integer 
type with the == operator. In some cases this will cause errors (Integer == only 
works reliably in [-128, 127], the range covered by the integer cache); we could 
use Integer.equals instead. But since readerSourceIndex is intrinsically a 
primitive int, changing Integer to int when checking sourceIndex would be more 
elegant than the Integer.equals method.

{code:java}
@Override
public Map<Integer, ReaderInfo> registeredReaders() {
    // TODO: not start enumerator until readers are ready?
    Map<Integer, ReaderInfo> readers = realContext.registeredReaders();
    if (readers.size() != readerSourceIndex.size()) {
        return filterRegisteredReaders(readers);
    }
    Integer lastIndex = null;
    for (Integer sourceIndex : readerSourceIndex.values()) {
        if (lastIndex != null && lastIndex != sourceIndex) {
            return filterRegisteredReaders(readers);
        }
        lastIndex = sourceIndex;
    }
    return readers;
}

private Map<Integer, ReaderInfo> filterRegisteredReaders(Map<Integer, ReaderInfo> readers) {
    Map<Integer, ReaderInfo> readersForSource = new HashMap<>(readers.size());
    for (Map.Entry<Integer, ReaderInfo> e : readers.entrySet()) {
        if (readerSourceIndex.get(e.getKey()) == (Integer) sourceIndex) {
            readersForSource.put(e.getKey(), e.getValue());
        }
    }
    return readersForSource;
}
{code}
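The boxed-comparison pitfall described in this ticket is easy to reproduce in isolation. The snippet below is a standalone demonstration of the Integer cache behavior on a default JVM (it is not Flink code):

```java
// Java caches boxed Integers in [-128, 127]; == on Integer compares references, not values.
class IntegerCacheDemo {
    public static void main(String[] args) {
        Integer small1 = 100, small2 = 100; // autoboxing hits the Integer cache
        Integer big1 = 1000, big2 = 1000;   // outside [-128, 127]: fresh objects

        System.out.println(small1 == small2);                   // true (same cached object)
        System.out.println(big1 == big2);                       // false (distinct objects)
        System.out.println(big1.equals(big2));                  // true (value comparison)
        System.out.println(big1.intValue() == big2.intValue()); // true (primitive int)
    }
}
```

This is why a `lastIndex != sourceIndex` comparison on boxed Integers silently works in small tests (indexes 0, 1, 2, ...) and only misbehaves once values leave the cached range, which is exactly the kind of bug the ticket proposes to eliminate by using primitive int.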






Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-06 Thread Piotr Nowojski
Hi,

I'm not sure. Maybe 5 will be fine? Anton, Dawid, what do you think?

Can you create a parent ticket for the whole FLIP to group all of the
issues together?

Also FLIP should be officially voted first.

Best,
Piotrek

On Fri, May 6, 2022 at 09:08, rui fan <1996fan...@gmail.com> wrote:

> Hi Anton, Piotrek and Dawid,
>
> Thanks for your help.
>
> I created FLINK-27522[1] as the first task. And I will finish it asap.
>
> @Piotrek, for the default value, do you think it should be less
> than 5? What do you think about 3? Actually, I think 5 isn't big.
> It's 1 or 3 or 5 that doesn't matter much, the focus is on
> reasonably resolving deadlock problems. Or I push the second
> task to move forward first and we discuss the default value in PR.
>
> For the legacySource, I got your idea. And I propose we create
> the third task to handle it. Because it is independent and for
> compatibility with the old API. What do you think? I updated
> the third task on FLIP-227[2].
>
> If all is ok, I will create a JIRA for the third Task and add it to
> FLIP-227. And I will develop them from the first task to the
> third task.
>
> Thanks again for your help.
>
> [1] https://issues.apache.org/jira/browse/FLINK-27522
> [2]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-227%3A+Support+overdraft+buffer
>
> Thanks
> fanrui
>
> On Fri, May 6, 2022 at 3:50 AM Piotr Nowojski 
> wrote:
>
> > Hi fanrui,
> >
> > > How to identify legacySource?
> >
> > legacy sources are always using the SourceStreamTask class and
> > SourceStreamTask is used only for legacy sources. But I'm not sure how to
> > enable/disable that. Adding `disableOverdraft()` call in SourceStreamTask
> > would be better compared to relying on the `getAvailableFuture()` call
> > (isn't it used for back pressure metric anyway?). Ideally we should
> > enable/disable it in the constructors, but that might be tricky.
> >
> > > I prefer it to be between 5 and 10
> >
> > I would vote for a smaller value because of FLINK-13203
> >
> > Piotrek
> >
> >
> >
> > On Thu, May 5, 2022 at 11:49, rui fan <1996fan...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> Thanks a lot for your discussion.
> >>
> >> After several discussions, I think it's clear now. I updated the
> >> "Proposed Changes" of FLIP-227[1]. If I have something
> >> missing, please help to add it to FLIP, or add it in the mail
> >> and I can add it to FLIP. If everything is OK, I will create a
> >> new JIRA for the first task, and use FLINK-26762[2] as the
> >> second task.
> >>
> >> About the legacy source, do we set maxOverdraftBuffersPerGate=0
> >> directly? How to identify legacySource? Or could we add
> >> the overdraftEnabled in LocalBufferPool? The default value
> >> is false. If the getAvailableFuture is called, change
> >> overdraftEnabled=true.
> >> It indicates whether there are checks isAvailable elsewhere.
> >> It might be more general, it can cover more cases.
> >>
> >> Also, I think the default value of 'max-overdraft-buffers-per-gate'
> >> needs to be confirmed. I prefer it to be between 5 and 10. How
> >> do you think?
> >>
> >> [1]
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-227%3A+Support+overdraft+buffer
> >> [2] https://issues.apache.org/jira/browse/FLINK-26762
> >>
> >> Thanks
> >> fanrui
> >>
> >> On Thu, May 5, 2022 at 4:41 PM Piotr Nowojski 
> >> wrote:
> >>
> >>> Hi again,
> >>>
> >>> After sleeping over this, if both versions (reserve and overdraft) have
> >>> the same complexity, I would also prefer the overdraft.
> >>>
> >>> > `Integer.MAX_VALUE` as default value was my idea as well but now, as
> >>> > Dawid mentioned, I think it is dangerous since it is too implicit for
> >>> > the user and if the user submits one more job for the same TaskManger
> >>>
> >>> As I mentioned, it's not only an issue with multiple jobs. The same
> >>> problem can happen with different subtasks from the same job, potentially
> >>> leading to the FLINK-13203 deadlock [1]. With FLINK-13203 fixed, I would
> >>> be in favour of Integer.MAX_VALUE to be the default value, but as it is,
> >>> I think we should indeed play on the safe side and limit it.
> >>>
> >>> > I still don't understand how should be limited "reserve"
> >>> implementation.
> >>> > I mean if we have X buffers in total and the user sets overdraft equal
> >>> > to X we obviously can not reserve all buffers, but how many we are
> >>> > allowed to reserve? Should it be a different configuration like
> >>> > percentegeForReservedBuffers?
> >>>
> >>> The reserve could be defined as percentage, or as a fixed number of
> >>> buffers. But yes. In normal operation subtask would not use the reserve,
> >>> as if numberOfAvailableBuffers < reserve, the output would be not
> >>> available. Only in the flatMap/timers/huge records case the reserve
> >>> could be used.
> >>>
> >>> > 1. If the total buffers of LocalBufferPool <= the reserve buffers,
> >>> will LocalBufferPool never be available? Can't process data?
> >>>
> >>> Of course
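One open question in this thread is how to keep overdraft disabled for legacy sources. fanrui's proposal is to flip an `overdraftEnabled` flag the first time `getAvailableFuture()` is called, since legacy sources never poll availability. The sketch below is a made-up simplification of that idea; `ToyLocalBufferPool` and its methods are hypothetical and not Flink's actual `LocalBufferPool` API:

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch: enable overdraft only once availability is polled. */
class ToyLocalBufferPool {
    private boolean overdraftEnabled = false; // legacy sources never flip this

    /**
     * New-style tasks call this between records for back-pressure handling;
     * that call is the signal used here to opt in to the overdraft.
     */
    CompletableFuture<Void> getAvailableFuture() {
        overdraftEnabled = true; // some caller checks isAvailable elsewhere
        return CompletableFuture.completedFuture(null);
    }

    /** Effective limit: the configured value for new-style tasks, 0 for legacy sources. */
    int effectiveOverdraft(int maxOverdraftBuffersPerGate) {
        return overdraftEnabled ? maxOverdraftBuffersPerGate : 0;
    }
}
```

A legacy source thread that only ever makes blocking buffer requests would keep `overdraftEnabled = false`, so `maxOverdraftBuffersPerGate` is effectively 0 for it; Piotrek's counter-proposal in this thread is to disable it explicitly in `SourceStreamTask` instead.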

[jira] [Created] (FLINK-27530) Support overdraft buffer

2022-05-06 Thread fanrui (Jira)
fanrui created FLINK-27530:
--

 Summary: Support overdraft buffer
 Key: FLINK-27530
 URL: https://issues.apache.org/jira/browse/FLINK-27530
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing, Runtime / Network
Reporter: fanrui


This is the umbrella issue for the overdraft buffer feature. Refer to 
[FLIP-227|https://cwiki.apache.org/confluence/display/FLINK/FLIP-227%3A+Support+overdraft+buffer]
  for more details.





[jira] [Created] (FLINK-27531) Add explicit doc on data consistency problem of RocksDB's map state iterator

2022-05-06 Thread Yun Tang (Jira)
Yun Tang created FLINK-27531:


 Summary: Add explicit doc on data consistency problem of RocksDB's 
map state iterator
 Key: FLINK-27531
 URL: https://issues.apache.org/jira/browse/FLINK-27531
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Yun Tang
Assignee: Yun Tang


Since the RocksDB map state was introduced, there has been a data consistency 
problem with its iteration. This is not well documented and deserves explicit 
documentation.





Re: [DISCUSS] FLIP-227: Support overdraft buffer

2022-05-06 Thread rui fan
Hi

I created the FLINK-27530[1] as the parent ticket. And I
updated it to FLIP.

[1] https://issues.apache.org/jira/browse/FLINK-27530

Thanks
fanrui

On Fri, May 6, 2022 at 4:27 PM Piotr Nowojski  wrote:

> Hi,
>
> I'm not sure. Maybe 5 will be fine? Anton, Dawid, what do you think?
>
> Can you create a parent ticket for the whole FLIP to group all of the
> issues together?
>
> Also FLIP should be officially voted first.
>
> Best,
> Piotrek
>
> On Fri, May 6, 2022 at 09:08, rui fan <1996fan...@gmail.com> wrote:
>
> > Hi Anton, Piotrek and Dawid,
> >
> > Thanks for your help.
> >
> > I created FLINK-27522[1] as the first task. And I will finish it asap.
> >
> > @Piotrek, for the default value, do you think it should be less
> > than 5? What do you think about 3? Actually, I think 5 isn't big.
> > It's 1 or 3 or 5 that doesn't matter much, the focus is on
> > reasonably resolving deadlock problems. Or I push the second
> > task to move forward first and we discuss the default value in PR.
> >
> > For the legacySource, I got your idea. And I propose we create
> > the third task to handle it. Because it is independent and for
> > compatibility with the old API. What do you think? I updated
> > the third task on FLIP-227[2].
> >
> > If all is ok, I will create a JIRA for the third Task and add it to
> > FLIP-227. And I will develop them from the first task to the
> > third task.
> >
> > Thanks again for your help.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-27522
> > [2]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-227%3A+Support+overdraft+buffer
> >
> > Thanks
> > fanrui
> >
> > On Fri, May 6, 2022 at 3:50 AM Piotr Nowojski 
> > wrote:
> >
> > > Hi fanrui,
> > >
> > > > How to identify legacySource?
> > >
> > > legacy sources are always using the SourceStreamTask class and
> > > SourceStreamTask is used only for legacy sources. But I'm not sure how to
> > > enable/disable that. Adding `disableOverdraft()` call in SourceStreamTask
> > > would be better compared to relying on the `getAvailableFuture()` call
> > > (isn't it used for back pressure metric anyway?). Ideally we should
> > > enable/disable it in the constructors, but that might be tricky.
> > >
> > > > I prefer it to be between 5 and 10
> > >
> > > I would vote for a smaller value because of FLINK-13203
> > >
> > > Piotrek
> > >
> > >
> > >
> > > On Thu, May 5, 2022 at 11:49, rui fan <1996fan...@gmail.com> wrote:
> > >
> > >> Hi,
> > >>
> > >> Thanks a lot for your discussion.
> > >>
> > >> After several discussions, I think it's clear now. I updated the
> > >> "Proposed Changes" of FLIP-227[1]. If I have something
> > >> missing, please help to add it to FLIP, or add it in the mail
> > >> and I can add it to FLIP. If everything is OK, I will create a
> > >> new JIRA for the first task, and use FLINK-26762[2] as the
> > >> second task.
> > >>
> > >> About the legacy source, do we set maxOverdraftBuffersPerGate=0
> > >> directly? How to identify legacySource? Or could we add
> > >> the overdraftEnabled in LocalBufferPool? The default value
> > >> is false. If the getAvailableFuture is called, change
> > >> overdraftEnabled=true.
> > >> It indicates whether there are checks isAvailable elsewhere.
> > >> It might be more general, it can cover more cases.
> > >>
> > >> Also, I think the default value of 'max-overdraft-buffers-per-gate'
> > >> needs to be confirmed. I prefer it to be between 5 and 10. How
> > >> do you think?
> > >>
> > >> [1]
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-227%3A+Support+overdraft+buffer
> > >> [2] https://issues.apache.org/jira/browse/FLINK-26762
> > >>
> > >> Thanks
> > >> fanrui
> > >>
> > >> On Thu, May 5, 2022 at 4:41 PM Piotr Nowojski 
> > >> wrote:
> > >>
> > >>> Hi again,
> > >>>
> > >>> After sleeping over this, if both versions (reserve and overdraft)
> have
> > >>> the same complexity, I would also prefer the overdraft.
> > >>>
> > >>> > `Integer.MAX_VALUE` as default value was my idea as well but now, as
> > >>> > Dawid mentioned, I think it is dangerous since it is too implicit for
> > >>> > the user and if the user submits one more job for the same TaskManger
> > >>>
> > >>> As I mentioned, it's not only an issue with multiple jobs. The same
> > >>> problem can happen with different subtasks from the same job, potentially
> > >>> leading to the FLINK-13203 deadlock [1]. With FLINK-13203 fixed, I would
> > >>> be in favour of Integer.MAX_VALUE to be the default value, but as it is,
> > >>> I think we should indeed play on the safe side and limit it.
> > >>>
> > >>> > I still don't understand how should be limited "reserve"
> > >>> implementation.
> > >>> > I mean if we have X buffers in total and the user sets overdraft equal
> > >>> > to X we obviously can not reserve all buffers, but how many we are
> > >>> > allowed to reserve? Should it be a different configuration like
> > >>> > percentegeForReservedBuffers?
> > >>>
> > >>> The 

Re: [DISCUSS] FLIP-91: Support SQL Client Gateway

2022-05-06 Thread LuNing Wang
Thanks, Shengkai, for driving this, and thanks all for your discussion.



> intergate the Gateway into the Flink code base

After talking with Shengkai offline and reading the Flink Forward Asia topic
`Powering HTAP at ByteDance with Apache Flink`, I think it is better to
integrate the Gateway code into the Flink codebase.


In the future, we can add a feature that merges the SQL gateway into the
JobManager. We can then request the JobManager API to directly submit the
Flink SQL job, which will further improve the performance of Flink OLAP. In
the future, Flink must be a unified engine for batch, stream, and OLAP.
Presto/Trino directly requests the master node to submit a job; if we do the
same, we can reduce O&M in Flink session mode. Perhaps the Flink application
mode can't merge the SQL gateway into the JobManager, but Flink OLAP almost
always uses session mode.

> Gateway to support multiple Flink versions


If we merge the SQL gateway into the JobManager, the SQL Gateway itself can
adapt to only one Flink version. We could introduce a network gateway to
redirect requests to Gateways or JobManagers of various versions. Perhaps
the network gateway could use other projects, like Apache Kyuubi or
Zeppelin, etc.

> I don't think that the Gateway is a 'core' function of Flink which should

be included with Flink.

In production environments, Flink SQL is almost always used through a gateway.
This can be observed in the user mailing lists and some Flink Forward topics.
A SQL gateway is an important piece of infrastructure for a big data compute
engine. As Flink does not have one, many Flink users build a SQL gateway on
the Apache Kyuubi project, but this should be the work of official Flink.

> I think it's fine to move this functionlity to the client rather than

gateway. WDYT?

I agree with having the `init-file` option in the client. I think the `init-file`
functionality in the Gateway is NOT important in the first version of the Gateway.
The Hive JDBC option 'initFile' already provides this functionality. After the
SQL Gateway is released and we observe feedback from the community, we may
discuss this problem again.

Best,

LuNing Wang.


Shengkai Fang wrote on Fri, May 6, 2022 at 14:34:

> Thanks Martijn, Nicholas, Godfrey, Jark and Jingsong for the feedback
>
> > I would like to understand why it's complicated to make the upgrades
> > problematic
>
> I agree with Jark's point. The API is actually not very stable in Flink.
> For example, the Gateway relies on the planner, but in release-1.14 Flink
> renamed the blink planner package, and in release-1.15 Flink made the
> planner Scala-free, which means other projects should not directly rely on
> the planner.
>
> >  Does the Flink SQL gateway support submitting a batch job?
>
> Of course. In the SQL Gateway, you can just use the sql SET
> 'execution.runtime-mode' = 'batch' to switch to the batch environment. Then
> the job you submit later will be executed in the batch mode.
>
> > The architecture of the Gateway is in the following graph.
> Is the TableEnvironment shared for all sessions ?
>
> No. Every session has its individual TableEnvironment. I have modified the
> graph to make everything more clear.
>
> > /v1/sessions
> >> Are both local file and remote file supported for `libs` and `jars`?
>
> We don't limit the usage here. But I think we will only support the local
> file in the next version.
>
> >> Does sql gateway support upload files?
>
> No. We need a new API to do this. We can collect more user feedback to
> determine whether we need to implement this feature.
>
> >/v1/sessions/:session_handle/configure_session
> >> Can this api be replaced with `/v1/sessions/:session_handle/statements`?
>
> Actually the API above is different. The
> `/v1/sessions/:session_handle/configure_session` API uses SQL to configure
> the environment, which only allows the limited types of SQL. But the
> `/v1/sessions/:session_handle/statements` has no limitation. I think we'd
> better use a different API to distinguish these.
>
> >/v1/sessions/:session_id/operations/:operation_handle/status
> >>`:session_id` is a typo, it should be `:session_handdle`
>
> Yes. I have fixed the mistake.
>
> >/v1/sessions/:session_handle/statements
> >The statement must be a single command
>
> >> Does this api support `begin statement set ... end` or `statement set
> >> begin ... end`?
>
> For BEGIN STATEMENT SET, it will open a buffer in the Session and allow
> users to submit insert statements into the Session later. When the Session
> receives the END statement, the Gateway will submit the buffered
> statements.
>
> For STATEMENT SET BEGIN ... END, the parser is able to parse the statement.
> We can treat it as other SQL.
>
> >> DO `ADD JAR`, `REMOVE JAR` support ? If yes, how to manage the jars?
>
> For ADD JAR/REMOVE JAR, if the jar is in the local environment, we will
> just add it into the class path or remove it from the class path. If the
> jar is the remote jar, we will create a session level directory and
> download the jar into the directory. When the session closes, it should
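Shengkai's description of BEGIN STATEMENT SET above (buffer inserts in the session, submit them all when END arrives) can be sketched with a toy session. This is a hypothetical simplification for illustration only; `ToySession` and its fields are invented names, not the proposed Gateway implementation:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the BEGIN STATEMENT SET buffering described above. */
class ToySession {
    private final List<String> buffered = new ArrayList<>();
    private boolean inStatementSet = false;
    // Each submitted "job" is recorded as the list of statements it was built from.
    final List<List<String>> submittedJobs = new ArrayList<>();

    void execute(String sql) {
        String s = sql.trim().toUpperCase();
        if (s.startsWith("BEGIN STATEMENT SET")) {
            inStatementSet = true;               // open the buffer; nothing runs yet
        } else if (inStatementSet && s.startsWith("END")) {
            submittedJobs.add(new ArrayList<>(buffered)); // one job for all buffered inserts
            buffered.clear();
            inStatementSet = false;
        } else if (inStatementSet) {
            buffered.add(sql);                   // INSERTs are only buffered here
        } else {
            submittedJobs.add(List.of(sql));     // ordinary statement: submit immediately
        }
    }
}
```

The key property is that nothing reaches the cluster between BEGIN STATEMENT SET and END: the buffered inserts become a single multi-sink job, which is what makes the construct useful for sharing sources and state between the inserts.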

Fwd: [Discuss] Creating an Apache Flink slack workspace

2022-05-06 Thread Xintong Song
Thank you~

Xintong Song



-- Forwarded message -
From: Xintong Song 
Date: Fri, May 6, 2022 at 5:07 PM
Subject: Re: [Discuss] Creating an Apache Flink slack workspace
To: private 
Cc: Chesnay Schepler 


Hi Chesnay,

Correct me if I'm wrong, but I don't find that this has been *repeatedly*
discussed on the ML. The only discussions I can find are [1] & [2], which are
from 4 years ago. On the other hand, I do find many users asking questions
about whether Slack is supported [2][3][4]. Besides, I also found a recent
discussion thread from ComDev [5], where alternative communication channels
are being discussed. It seems to me the ASF is quite open to having such
additional channels, and they have worked well for many projects already.

I see two reasons for bringing up this discussion again:
1. There are indeed many things that have changed during the past 4 years.
We have more contributors, including committers and PMC members, and even
more users from various organizations and timezones. That also means more
discussions and Q&As are happening.
2. The proposal here is different from the previous discussion. Instead of
maintaining a channel for Flink in the ASF workspace, here we are proposing
to create a dedicated Apache Flink slack workspace. And instead of *moving*
the discussion to Slack, we are proposing to add a Slack Workspace as an
addition to the ML.

Below are the opinions I found in your previous -1 [1]. IIUR, these
are all about using the ASF Slack workspace. If I overlooked anything,
please let me know.

> 1. According to INFRA-14292 <
> https://issues.apache.org/jira/browse/INFRA-14292> the ASF Slack isn't
> run by the ASF. This alone puts this service into rather questionable
> territory as it /looks/ like an official ASF service. If anyone can provide
> information to the contrary, please do so.

> 2. We already discuss things on the mailing lists, JIRA and GitHub. All of
> these are available to the public, whereas the slack channel requires an
> @apache mail address, i.e. you have to be a committer. This minimizes the
> target audience rather significantly. I would much rather prefer something
> that is also available to contributors.


I do agree this should be decided by the whole community. I'll forward this
to dev@ and user@ ML.

Thank you~

Xintong Song


[1] https://lists.apache.org/thread/gxwv49ssq82g06dbhy339x6rdxtlcv3d
[2] https://lists.apache.org/thread/kcym1sozkrtwxw1fjbnwk1nqrrlzolcc
[3] https://lists.apache.org/thread/7rmd3ov6sv3wwhflp97n4czz25hvmqm6
[4] https://lists.apache.org/thread/n5y1kzv50bkkbl3ys494dglyxl45bmts
[5] https://lists.apache.org/thread/fzwd3lj0x53hkq3od5ot0y719dn3kj1j

On Fri, May 6, 2022 at 3:05 PM Chesnay Schepler  wrote:

> This has been repeatedly discussed on the ML over the years and was
> rejected every time.
>
> I don't see that anything has changed that would invalidate the previously
> raised arguments against it, so I'm still -1 on it.
>
> This is also not something the PMC should decide anyway, but the project
> as a whole.
>
> On 06/05/2022 06:48, Jark Wu wrote:
>
> Thanks Xintong for starting this exciting topic.
>
> I think Slack would be an essential addition to the mailing list.
> I have talked with some Flink users, and they are surprised
> Flink doesn't have Slack yet, and they would love to use Slack.
> We can also see a trend that new open-source communities
> are using Slack as the community base camp.
>
> Slack is also helpful for brainstorming and asking people for opinions and
> use cases.
> I think Slack is not only another place for Q&A but also a connection to
> the Flink users.
> We can create more channels to give the community more of a social
> dimension, for example,
>  - Share ideas, projects, integrations, articles, and presentations
> related to Flink in the #shows channel
> - Flink releases and events in the #news channel
>
> Thus, I'm +1 to creating an Apache Flink Slack workspace, and I can help
> set it up and maintain it.
>
> Best,
> Jark
>
> On Fri, 6 May 2022 at 10:38, Xintong Song  wrote:
>
>> Hi all,
>>
>> I’d like to start a discussion on creating an Apache Flink slack
>> workspace.
>>
>> ## Motivation
>> Today many organizations choose to do real-time communication through
>> Slack. IMHO, we, Flink, as a technology for real-time computing, should
>> embrace a more real-time way of communicating, especially for ad-hoc
>> questions and interactions. With more and more contributors from different
>> organizations joining this community, it would be good to provide a common
>> channel for such real-time communication. Therefore, I'd propose creating
>> an Apache Flink Slack workspace that is maintained by the Flink PMC.
>>
>> ## Benefits
>> - Easier to reach out to people. Messages are less likely to be overlooked.
>> - Real-time messages, voice/video calls, and file transfers that help
>> improve communication efficiency.
>> - Finer-grained channels (e.g., flink-ml, flink-statefun, temporary
>> discussion channe

[jira] [Created] (FLINK-27532) Drop flink-clients test-jar

2022-05-06 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-27532:


 Summary: Drop flink-clients test-jar
 Key: FLINK-27532
 URL: https://issues.apache.org/jira/browse/FLINK-27532
 Project: Flink
  Issue Type: Technical Debt
  Components: Build System, Command Line Client
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.16.0


The test-jar is actually unused and could be removed entirely.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [ANNOUNCE] Apache Flink 1.15.0 released

2022-05-06 Thread Johannes Moser
Yes! 🎉

Thanks to the whole community.
All this involvement keeps impressing me.

> On 06.05.2022, at 01:42, Thomas Weise  wrote:
> 
> Thank you to all who contributed for making this release happen!
> 
> Thomas
> 
> On Thu, May 5, 2022 at 7:41 AM Zhu Zhu  wrote:
>> 
>> Thanks Yun, Till and Joe for the great work and thanks everyone who
>> makes this release possible!
>> 
>> Cheers,
>> Zhu
>> 
>> Jiangang Liu  于2022年5月5日周四 21:09写道:
>>> 
>>> Congratulations! This version is really helpful for us. We will explore it
>>> and help to improve it.
>>> 
>>> Best
>>> Jiangang Liu
>>> 
>>> Yu Li  于2022年5月5日周四 18:53写道:
>>> 
 Hurray!
 
 Thanks Yun Gao, Till and Joe for all the efforts as our release managers.
 And thanks all contributors for making this happen!
 
 Best Regards,
 Yu
 
 
 On Thu, 5 May 2022 at 18:01, Sergey Nuyanzin  wrote:
 
> Great news!
> Congratulations!
> Thanks to the release managers, and everyone involved.
> 
> On Thu, May 5, 2022 at 11:57 AM godfrey he  wrote:
> 
>> Congratulations~
>> 
>> Thanks Yun, Till and Joe for driving this release
>> and everyone who made this release happen.
>> 
>> Best,
>> Godfrey
>> 
>> Becket Qin  于2022年5月5日周四 17:39写道:
>>> 
>>> Hooray! Thanks Yun, Till and Joe for driving the release!
>>> 
>>> Cheers,
>>> 
>>> JIangjie (Becket) Qin
>>> 
>>> On Thu, May 5, 2022 at 5:20 PM Timo Walther 
> wrote:
>>> 
 It took a bit longer than usual. But I'm sure the users will love
> this
 release.
 
 Big thanks to the release managers!
 
 Timo
 
 Am 05.05.22 um 10:45 schrieb Yuan Mei:
> Great!
> 
> Thanks, Yun Gao, Till, and Joe for driving the release, and
 thanks
> to
> everyone for making this release happen!
> 
> Best
> Yuan
> 
> On Thu, May 5, 2022 at 4:40 PM Leonard Xu 
> wrote:
> 
>> Congratulations!
>> 
>> Thanks Yun Gao, Till and Joe for the great work as our release managers,
>> and everyone who was involved.
>> 
>> Best,
>> Leonard
>> 
>> 
>> 
>>> 2022年5月5日 下午4:30,Yang Wang  写道:
>>> 
>>> Congratulations!
>>> 
>>> Thanks Yun Gao, Till and Joe for driving this release and
> everyone
>> who
>> made
>>> this release happen.
>> 
 
 
>> 
> 
> 
> --
> Best regards,
> Sergey
> 
 



[REMINDER] Final Call for Presentations for Flink Forward San Francisco 2022

2022-05-06 Thread Timo Walther

Hi everyone,

I would like to send out a final reminder. We have already received some
great submissions for Flink Forward San Francisco 2022. Nevertheless, we
decided to extend the deadline by another week to give people a second
chance to work on their abstracts and presentation ideas.


This is the final call to be a part of the event as a speaker - until 
11:59, May 12th PDT.


Any topic that can be categorized as
- Flink Use Cases
- Flink Operations
- Technology Deep Dives
- Ecosystem
- Community
is welcome.

NOTE: This will be an in-person event. However, if your country has 
travel restrictions, please let us know in the form. We will offer a 
limited number of slots for pre-recorded/remote Q&A talks to not exclude 
anyone.


https://www.flink-forward.org/sf-2022/call-for-presentations

In any case, it would be great to meet each other again!

Looking forward to the event,

Timo




Re: [DISCUSS] FLIP-224: Blacklist Mechanism

2022-05-06 Thread Jiangang Liu
Thanks for the valuable design. Auto-detection can save us a great deal of
work. We have implemented a similar feature in our internal Flink version.
Below are some things that I care about:

   1. For auto-detection, I wonder how the strategy decides to mark a node
   as blocked? A bad node is sometimes hard to detect; for example, when
   the network is unreachable, the upstream node or the downstream node
   may be the one that gets blocked.
   2. I see that the strategy is applied on the JobMaster side. How about
   implementing similar logic in the ResourceManager? In session mode,
   multiple jobs can fail on the same bad node, and that node should be
   marked as blocked. If each job applies the strategy individually, the
   node may never be marked as blocked because no single job's failure
   count exceeds the threshold.
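To make point 2 concrete: a ResourceManager-side strategy could aggregate failures from all jobs before marking a node as blocked. The following threshold-based counter is only a sketch under assumed names; nothing here comes from the FLIP-224 design itself:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative threshold-based node failure tracker; hypothetical names. */
public class NodeFailureTracker {
    private final int threshold;
    private final Map<String, Integer> failuresPerNode = new HashMap<>();
    private final Set<String> blockedNodes = new HashSet<>();

    public NodeFailureTracker(int threshold) {
        this.threshold = threshold;
    }

    /** Record one task failure on the given node, from any job. */
    public void recordFailure(String nodeId) {
        int count = failuresPerNode.merge(nodeId, 1, Integer::sum);
        if (count >= threshold) {
            blockedNodes.add(nodeId); // failures from all jobs add up here
        }
    }

    public boolean isBlocked(String nodeId) {
        return blockedNodes.contains(nodeId);
    }
}
```

Kept in the ResourceManager, two session-mode jobs failing twice each on the same node would cross a threshold of 3, whereas per-job counting on each JobMaster would not.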


Zhu Zhu  于2022年5月5日周四 23:35写道:

> Thank you for all your feedback!
>
> Besides the answers from Lijie, I'd like to share some of my thoughts:
> 1. Whether to enable an automatic blocklist
> Generally speaking, it is not a goal of FLIP-224.
> The automatic way should be something built upon the blocklist
> mechanism and well decoupled. It was designed to be a configurable
> blocklist strategy, but I think we can further decouple it by
> introducing an abnormal node detector, as Becket suggested, which just
> uses the blocklist mechanism once bad nodes are detected. However, it
> should be a separate FLIP with further dev discussions and feedback
> from users. I also agree with Becket that different users have different
> requirements, and we should listen to them.
>
> 2. Is it enough to just take away abnormal nodes externally
> My answer is no. As Lijie has mentioned, we need a way to avoid
> deploying tasks to temporary hot nodes. In this case, users may just
> want to limit the load of the node and do not want to kill all the
> processes on it. Another case is the speculative execution[1] which
> may also leverage this feature to avoid starting mirror tasks on slow
> nodes.
>
> Thanks,
> Zhu
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+execution+for+Batch+Job
>
> Lijie Wang  于2022年5月5日周四 15:56写道:
>
> >
> > Hi everyone,
> >
> >
> > Thanks for your feedback.
> >
> >
> > There's one detail that I'd like to re-emphasize here because it can
> affect the value and design of the blocklist mechanism (perhaps I should
> highlight it in the FLIP). We propose two actions in FLIP:
> >
> > 1) MARK_BLOCKLISTED: Just mark the task manager or node as blocked.
> Future slots should not be allocated from the blocked task manager or node.
> But slots that are already allocated will not be affected. A typical
> application scenario is to mitigate machine hotspots. In this case, we hope
> that subsequent resource allocations will not be on the hot machine, but
> tasks currently running on it should not be affected.
> >
> > 2) MARK_BLOCKLISTED_AND_EVACUATE_TASKS: Mark the task manager or node as
> blocked, and evacuate all tasks on it. Evacuated tasks will be restarted on
> non-blocked task managers.
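As a sketch, the two actions might be modeled as follows; the enum constants come from the FLIP text, while the helper method is a hypothetical illustration of the behavioral difference:

```java
/** The two blocklist actions described above; the helper is illustrative. */
public enum BlocklistAction {
    /** Block future slot allocations only; running tasks are unaffected. */
    MARK_BLOCKLISTED,
    /** Block future allocations and restart running tasks elsewhere. */
    MARK_BLOCKLISTED_AND_EVACUATE_TASKS;

    /** Whether tasks currently running on the blocked resource are evacuated. */
    public boolean evacuatesRunningTasks() {
        return this == MARK_BLOCKLISTED_AND_EVACUATE_TASKS;
    }
}
```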
> >
> > For the above 2 actions, the former may better highlight the value of
> this FLIP, because the external system cannot do that.
> >
> >
> > Regarding *Manually* and *Automatically*, I basically agree with @Becket
> Qin: different users have different answers. Not all users’ deployment
> environments have a special external system that can perform the anomaly
> detection. In addition, adding pluggable/optional auto-detection doesn't
> require much extra work on top of manual specification.
> >
> >
> > I will answer your other questions one by one.
> >
> >
> > @Yangze
> >
> > a) I think you are right, we do not need to expose the
> `cluster.resource-blocklist.item.timeout-check-interval` to users.
> >
> > b) We can abstract the `notifyException` to a separate interface (maybe
> BlocklistExceptionListener), and the ResourceManagerBlocklistHandler can
> implement it in the future.
> >
> >
> > @Martijn
> >
> > a) I also think the manual blocking should be done by cluster operators.
> >
> > b) I think manual blocking makes sense, because in my experience, users
> are often the first to notice machine problems (because of job failovers
> or delays), and they will contact cluster operators to solve them, or even
> tell the cluster operators which machine is problematic. From this point
> of view, I think the people who really need manual blocking are the users;
> it is just performed by the cluster operators, so I think manual blocking
> makes sense.
> >
> >
> > @Chesnay
> >
> > We need to touch the logic of JM/SlotPool, because for MARK_BLOCKLISTED,
> we need to know whether the slot is blocklisted when the task is
> FINISHED/CANCELLED/FAILED. If so, SlotPool should release the slot
> directly to avoid assigning other tasks (of this job) to it. If we only
> maintain the blocklist information on the RM, JM needs to retrieve it by
> RPC. I think the performance overhead of that is relatively large, so I
> think it's worth m

Re: [DISCUSS] FLIP-91: Support SQL Client Gateway

2022-05-06 Thread godfrey he
Hi Martijn, Shengkai

>I don't think that the Gateway is a 'core'
>function of Flink which should be included with Flink.
I have a different viewpoint. For SQL users, it provides an out-of-the-box
experience. Just like the SQL client, users can enjoy various SQL
experiences right after downloading/building a Flink package. In the
design, the SQL client is just a client/part of the SQL gateway.
We can find that, for HiveServer2 and Presto, a similar role is also built-in.

> the Gateway to support multiple Flink versions
I think this idea is good for users, who could use one service to
support multiple Flink versions. But I do not think the current design
should support it.
As we know, the API has changed a lot in recent versions of Flink,
and it's very hard to put different versions of Flink's code into one
process without any class conflicts.
The easiest way to support it would be to use a sub-process model.
Each sub-process would be the SQL gateway we are discussing now.
We could start another project (it could be outside of the Flink project)
to support it.
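For illustration only, the sub-process model could amount to launching one gateway process per Flink installation and routing requests by version. The script path and flag below are hypothetical, not an actual Flink CLI:

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of launching per-version gateway sub-processes; names hypothetical. */
public class GatewayLauncher {
    /** Build the command line for a gateway bound to one Flink installation. */
    public static List<String> launchCommand(String flinkHome, int port) {
        return Arrays.asList(
                flinkHome + "/bin/sql-gateway.sh",
                "--port", String.valueOf(port));
    }

    /** Start the sub-process; the caller routes client requests to its port. */
    public static Process launch(String flinkHome, int port) throws Exception {
        return new ProcessBuilder(launchCommand(flinkHome, port))
                .inheritIO()
                .start();
    }
}
```

One such sub-process per installed Flink version avoids class conflicts entirely, at the cost of running several JVMs.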

> For ADD JAR/REMOVE JAR, if the jar is in the local environment, we will
>just add it into the class path or remove it from the class path.
The client and the Gateway service may be on different machines.

Best,
Godfrey

LuNing Wang  于2022年5月6日周五 17:06写道:
>
> Thanks Shengkai for driving this, and all of you for the discussion.
>
>
>
> > integrate the Gateway into the Flink code base
>
> After talking with Shengkai offline and reading the talk `Powering HTAP at
> ByteDance with Apache Flink` from Flink Forward Asia, I think it is better
> to integrate the Gateway code into the Flink codebase.
>
>
> In the future, we could add a feature that merges the SQL gateway into
> the JobManager, requesting the JobManager API to directly submit Flink
> SQL jobs. It would further improve the performance of Flink OLAP. In the
> future, the Flink must be a unified engine for batch, stream, and OLAP.
> Presto/Trino directly request the master node to submit a job; if we did
> the same, we could reduce Q&M in Flink session mode. Perhaps the Flink
> application mode can't merge the SQL gateway into the JobManager, but
> Flink OLAP almost always uses session mode.
>
> > Gateway to support multiple Flink versions
>
>
> If we merge the SQL gateway into the JobManager, the SQL Gateway itself
> can adapt to only one Flink version. We could introduce a network gateway
> to redirect requests to gateways or JobManagers of various versions.
> Perhaps the network gateway could use other projects, like Apache Kyuubi
> or Zeppelin, etc.
>
> > I don't think that the Gateway is a 'core' function of Flink which should
>
> be included with Flink.
>
> In production environments, Flink SQL almost always uses a gateway. This
> can be observed in the user mailing lists and some Flink Forward talks.
> The SQL gateway is an important piece of infrastructure for a big data
> compute engine. As Flink does not have one, many Flink users build a SQL
> gateway with the Apache Kyuubi project, but it should be the work of the
> official Flink project.
>
> > I think it's fine to move this functionlity to the client rather than
>
> gateway. WDYT?
>
> I agree with having the `init-file` option in the client. I think the
> `init-file` functionality in the Gateway is NOT important for the first
> version of the Gateway. The Hive JDBC option 'initFile' already provides
> this functionality. After the SQL Gateway is released and we observe
> feedback from the community, we may discuss this problem again.
>
> Best,
>
> LuNing Wang.
>
>
> Shengkai Fang  于2022年5月6日周五 14:34写道:
>
> > Thanks Martijn, Nicholas, Godfrey, Jark and Jingsong for the feedback
> >
> > > I would like to understand why it's complicated to make the upgrades
> > > problematic
> >
> > I agree with Jark's point. The API is actually not very stable in
> > Flink. For example, the Gateway relies on the planner, but in
> > release-1.14 Flink renamed the blink planner package, and in
> > release-1.15 Flink made the planner Scala-free, which means other
> > projects should not directly rely on the planner.
> >
> > >  Does the Flink SQL gateway support submitting a batch job?
> >
> > Of course. In the SQL Gateway, you can just use the statement SET
> > 'execution.runtime-mode' = 'batch' to switch to the batch environment.
> > Jobs submitted afterwards will be executed in batch mode.
> >
> > > The architecture of the Gateway is in the following graph.
> > Is the TableEnvironment shared for all sessions ?
> >
> > No. Every session has its own TableEnvironment. I have modified the
> > graph to make everything clearer.
> >
> > > /v1/sessions
> > >> Are both local file and remote file supported for `libs` and `jars`?
> >
> > We don't limit the usage here. But I think we will only support the local
> > file in the next version.
> >
> > >> Does sql gateway support upload files?
> >
> > No. We need a new API to do this. We can collect more user feedback to
> > determine whether we need to implement this feature.
> >
> > >/v1/sessions/:session_handle/configure_session
> > >

Re: [DISCUSS] DockerHub repository maintainers

2022-05-06 Thread Konstantin Knauf
Hi Xintong,

it is a pity that we can only have 5 maintainers. Every (patch) release of
flink, flink-statefun, and flink-kubernetes-operator then requires a
maintainer to publish the image, if I am not mistaken. As it's mostly
different groups managing the sub-projects, this is quite a bottleneck. If
we give one seat to the flink-statefun maintainers and one to the
flink-kubernetes-operator maintainers, that leaves three seats for Apache
Flink core, and there is no redundancy for the other projects. When I
managed the last two patch releases, DockerHub access was also the
biggest hurdle. Maybe we can talk to the INFRA people again. We can
certainly reduce the number, but 5 is very few.

Cheers,

Konstantin






Am Fr., 6. Mai 2022 um 09:00 Uhr schrieb Till Rohrmann :

> Hi everyone,
>
> thanks for starting this discussion Xintong. I would volunteer as a
> maintainer of the flink-statefun Docker repository if you need one.
>
> Cheers,
> Till
>
> On Fri, May 6, 2022 at 6:22 AM Xintong Song  wrote:
>
>> It seems to me we at least don't have a consensus on dropping the use of
>> apache namespace, which means we need to decide on a list of maintainers
>> anyway. So maybe we can get the discussion back to the maintainers. We may
>> continue the official-image vs. apache-namespace in a separate thread if
>> necessary.
>>
>> As mentioned previously, we need to reduce the number of maintainers from
>> 20 to 5, as required by INFRA. Jingsong and I would like to volunteer as 2
>> of the 5, and we would like to learn who else wants to join us. Of course
>> the list of maintainers can be modified later.
>>
>> *This also means the current maintainers may be removed from the list.*
>> Please let us know if you still need that privilege. CC-ed all the current
>> maintainers for attention.
>>
>> Thank you~
>>
>> Xintong Song
>>
>>
>>
>> On Wed, May 4, 2022 at 3:14 PM Chesnay Schepler 
>> wrote:
>>
>> > One advantage is that the images are periodically rebuilt to get
>> > security fixes.
>> >
>> > The operator is a different story anyway because it is AFAIK only
>> > supposed to be used via docker
>> > (i.e., no standalone mode), which alleviates concerns about keeping the
>> > logic within the image
>> > to a minimum (which bit us in the past on the flink side).
>> >
>> > On 03/05/2022 16:09, Yang Wang wrote:
>> > > The flink-kubernetes-operator project is only published
>> > > via apache/flink-kubernetes-operator on docker hub and github
>> packages.
>> > > We do not find the obvious advantages by using docker hub official
>> > images.
>> > >
>> > > Best,
>> > > Yang
>> > >
>> > > Xintong Song  于2022年4月28日周四 19:27写道:
>> > >
>> > >> I agree with you that doing QA for the image after the release has
>> been
>> > >> finalized doesn't feel right. IIUR, that is mostly because official
>> > image
>> > >> PR needs 1) the binary release being deployed and propagated and 2)
>> the
>> > >> corresponding git commit being specified. I'm not completely sure
>> about
>> > >> this. Maybe we can improve the process by investigating more about
>> the
>> > >> feasibility of pre-verifying an official image PR before finalizing
>> the
>> > >> release. It's definitely a good thing to do if possible.
>> > >>
>> > >> I also agree that QA from DockerHub folks is valuable to us.
>> > >>
>> > >> I'm not against publishing official-images, and I'm not against
>> working
>> > >> closely with the DockerHub folks to improve the process of delivering
>> > the
>> > >> official image. However, I don't think these should become reasons
>> that
>> > we
>> > >> don't release our own apache/flink images.
>> > >>
>> > >> Taking the 1.12.0 as an example, admittedly it would be nice for us
>> to
>> > >> comply with the DockerHub folks' standards and not have a
>> > >> just-for-kubernetes command in our entrypoint. However, this is IMO
>> far
>> > >> less important compared to delivering the image to our users timely.
>> I
>> > >> guess that's where the DockerHub folks and us have different
>> > >> priorities, and that's why I think we should have a path that is
>> fully
>> > >> controlled by this community to deliver images. We could take their
>> > >> valuable inputs and improve afterwards. Actually, that's what we did
>> for
>> > >> 1.12.0 by starting to release to apache/flink.
>> > >>
>> > >> Thank you~
>> > >>
>> > >> Xintong Song
>> > >>
>> > >>
>> > >>
>> > >> On Thu, Apr 28, 2022 at 6:30 PM Chesnay Schepler > >
>> > >> wrote:
>> > >>
>> > >>> I still think that's mostly a process issue.
>> > >>> Of course we can be blind-sided if we do the QA for a release
>> artifact
>> > >>> after the release has been finalized.
>> > >>> But that's a clearly broken process from the get-go.
>> > >>>
>> > >>> At the very least we should already open a PR when the RC is
>> created to
>> > >>> get earlier feedback.
>> > >>>
>> > >>> Moreover, nowadays the docker images are way slimmer and we are much
>> > >>> more careful on what is actually added to the scripts.
>> > >>>
>> > 

Re: [DISCUSS] FLIP-91: Support SQL Client Gateway

2022-05-06 Thread Martijn Visser
Hi everyone,

Happy to see that this discussion is very active. A couple of comments:

> It's not about internal interfaces. Flink itself doesn't provide backward
compatibility for public APIs.

Is that so? In FLIP-196 [1] is explicitly stated "What we guarantee in
terms of stability is that a program written against a public API will
compile w/o errors when upgrading Flink (API backwards compatibility)".

> Sorry, I don't see any users requesting this feature for such a long time
for SQL Gateway.
> So you have to have a gateway to couple with the Flink version.

I would be interested to hear the opinion of users on this. I have no
strong opinion on it myself. I could see value in having multiple Flink
version support in a Gateway, but if there's no user demand for it, then
fine. I could imagine that multiple version support is desired by end
users but hasn't been requested yet because of the current state of the
SQL gateway. We can also simply place it out of scope for now and say that
multiple version support could be realised by an external component that
uses the Gateway from the individual Flink releases.

On the topic of having the Gateway in the Flink repo itself, I liked
Jingsong's argument that the SQL Client should use the Gateway, which
creates a dependency. Given that I do think the SQL Client is an important
starter functionality, that does give a compelling argument to include the
Gateway in the Flink repository. I could agree with that, as long as we do
commit that the Gateway uses properly defined public interfaces. So this
FLIP should follow FLIP-196 [1] and FLIP-197 [2].

> In the future, the Flink must be a unified engine for batch, stream, and
OLAP.

While I understand and know the focus of some maintainers on OLAP, I do
think that the Flink project has not made any official decision that it
**must** include OLAP. The current scope is still a unified engine for
batch and streaming. As long as OLAP improvements don't hurt or cause
problems with the unified batch and streaming, there's no issue of course.
But I am careful because we know there can and will be situations where
OLAP features could conflict with unified batch and streaming (and vice
versa of course).

Best regards,

Martijn Visser
https://twitter.com/MartijnVisser82
https://github.com/MartijnVisser

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-196%3A+Source+API+stability+guarantees
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process


On Fri, 6 May 2022 at 12:05, godfrey he  wrote:

> Hi Martijn, Shengkai
>
> >I don't think that the Gateway is a 'core'
> >function of Flink which should be included with Flink.
> I have a different viewpoint. For SQL users, it provides an out-of-the-box
> experience.
> Just like the SQL client, users can enjoy various SQL experiences after
> downloading/building a Flink package. In the design, the SQL client is
> just a client/part of the SQL gateway.
> We can find that, for HiveServer2 and Presto, a similar role is also
> built-in.
>
> > the Gateway to support multiple Flink versions
> I think this idea is good for users, who could use one service to
> support multiple Flink versions. But I do not think the current design
> should support it.
> As we know, the API has changed a lot in recent versions of Flink,
> and it's very hard to put different versions of Flink's code into one
> process without any class conflicts.
> The easiest way to support it would be to use a sub-process model.
> Each sub-process would be the SQL gateway we are discussing now.
> We could start another project (it could be outside of the Flink project)
> to support it.
>
> > For ADD JAR/REMOVE JAR, if the jar is in the local environment, we will
> >just add it into the class path or remove it from the class path.
> The client and the Gateway service may be on different machines.
>
> Best,
> Godfrey
>
> LuNing Wang  于2022年5月6日周五 17:06写道:
> >
> > Thanks Shengkai for driving this, and all of you for the discussion.
> >
> >
> >
> > > integrate the Gateway into the Flink code base
> >
> > After talking with Shengkai offline and reading the talk `Powering HTAP
> > at ByteDance with Apache Flink` from Flink Forward Asia, I think it is
> > better to integrate the Gateway code into the Flink codebase.
> >
> >
> > In the future, we can add a feature that merges SQL gateway into
> > JobManager. We can request JobManager API to directly submit the Flink
> SQL
> > job. It will further improve the performance of Flink OLAP.  In the
> future,
> > the Flink must be a unified engine for batch, stream, and OLAP. The
> > Presto/Trino directly requests the master node to submit a job, if so, we
> > can reduce Q&M in Flink session mode. Perhaps, the Flink application mode
> > can’t merge SQL gateway into JobManager, but Flink OLAP almost always
> uses
> > session mode.
> >
> > > Gateway to support multiple Flink versions
> >
> >
> > If we will merge the SQL gateway into JobManager, 

[jira] [Created] (FLINK-27533) Unstable AdaptiveSchedulerSimpleITCase#testJobCancellationWhileRestartingSucceeds

2022-05-06 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-27533:


 Summary: Unstable 
AdaptiveSchedulerSimpleITCase#testJobCancellationWhileRestartingSucceeds
 Key: FLINK-27533
 URL: https://issues.apache.org/jira/browse/FLINK-27533
 Project: Flink
  Issue Type: Technical Debt
  Components: Runtime / Coordination, Tests
Affects Versions: 1.16.0
Reporter: Chesnay Schepler
 Fix For: 1.16.0


https://dev.azure.com/chesnay/flink/_build/results?buildId=2599&view=logs&j=9dc1b5dc-bcfa-5f83-eaa7-0cb181ddc267&t=511d2595-ec54-5ab7-86ce-92f328796f20

{code}
May 06 10:30:22 [ERROR] 
org.apache.flink.runtime.scheduler.adaptive.AdaptiveSchedulerSimpleITCase.testJobCancellationWhileRestartingSucceeds
  Time elapsed: 0.836 s  <<< ERROR!
May 06 10:30:22 org.apache.flink.util.FlinkException: Exhausted retry attempts.
May 06 10:30:22 at 
org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:173)
May 06 10:30:22 at 
org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:158)
May 06 10:30:22 at 
org.apache.flink.runtime.scheduler.adaptive.AdaptiveSchedulerSimpleITCase.testJobCancellationWhileRestartingSucceeds(AdaptiveSchedulerSimpleITCase.java:128)
May
{code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] FLIP-224: Blacklist Mechanism

2022-05-06 Thread Martijn Visser
> If we only support blocking nodes manually, then I could not see
the obvious advantages compared with the current SRE approach (via *yarn
rmadmin or kubectl taint*).

I agree with Yang Wang on this.

>  To me this sounds yet again like one of those magical mechanisms that
will rarely work just right.

I also agree with Chesnay that magical mechanisms are indeed super hard to
get right.

Best regards,

Martijn

On Fri, 6 May 2022 at 12:03, Jiangang Liu  wrote:

> Thanks for the valuable design. Auto-detection can save us a great deal of
> work. We have implemented a similar feature in our internal Flink version.
> Below are some things that I care about:
>
>    1. For auto-detection, I wonder how the strategy decides to mark a node
>    as blocked? A bad node is sometimes hard to detect; for example, when
>    the network is unreachable, the upstream node or the downstream node
>    may be the one that gets blocked.
>    2. I see that the strategy is applied on the JobMaster side. How about
>    implementing similar logic in the ResourceManager? In session mode,
>    multiple jobs can fail on the same bad node, and that node should be
>    marked as blocked. If each job applies the strategy individually, the
>    node may never be marked as blocked because no single job's failure
>    count exceeds the threshold.
>
>
> Zhu Zhu  于2022年5月5日周四 23:35写道:
>
> > Thank you for all your feedback!
> >
> > Besides the answers from Lijie, I'd like to share some of my thoughts:
> > 1. Whether to enable an automatic blocklist
> > Generally speaking, it is not a goal of FLIP-224.
> > The automatic way should be something built upon the blocklist
> > mechanism and well decoupled. It was designed to be a configurable
> > blocklist strategy, but I think we can further decouple it by
> > introducing an abnormal node detector, as Becket suggested, which just
> > uses the blocklist mechanism once bad nodes are detected. However, it
> > should be a separate FLIP with further dev discussions and feedback
> > from users. I also agree with Becket that different users have different
> > requirements, and we should listen to them.
> >
> > 2. Is it enough to just take away abnormal nodes externally
> > My answer is no. As Lijie has mentioned, we need a way to avoid
> > deploying tasks to temporary hot nodes. In this case, users may just
> > want to limit the load of the node and do not want to kill all the
> > processes on it. Another case is the speculative execution[1] which
> > may also leverage this feature to avoid starting mirror tasks on slow
> > nodes.
> >
> > Thanks,
> > Zhu
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+execution+for+Batch+Job
> >
> > On Thu, May 5, 2022 at 15:56, Lijie Wang wrote:
> >
> > >
> > > Hi everyone,
> > >
> > >
> > > Thanks for your feedback.
> > >
> > >
> > > There's one detail that I'd like to re-emphasize here because it can
> > affect the value and design of the blocklist mechanism (perhaps I should
> > highlight it in the FLIP). We propose two actions in FLIP:
> > >
> > > 1) MARK_BLOCKLISTED: Just mark the task manager or node as blocked.
> > Future slots should not be allocated from the blocked task manager or
> node.
> > But slots that are already allocated will not be affected. A typical
> > application scenario is to mitigate machine hotspots. In this case, we
> hope
> > that subsequent resource allocations will not be on the hot machine, but
> > tasks currently running on it should not be affected.
> > >
> > > 2) MARK_BLOCKLISTED_AND_EVACUATE_TASKS: Mark the task manager or node
> as
> > blocked, and evacuate all tasks on it. Evacuated tasks will be restarted
> on
> > non-blocked task managers.
> > >
> > > For the above 2 actions, the former may more highlight the meaning of
> > this FLIP, because the external system cannot do that.
> > >
> > >
> > > Regarding *Manually* and *Automatically*, I basically agree with
> @Becket
> > Qin: different users have different answers. Not all users’ deployment
> > environments have a special external system that can perform the anomaly
> > detection. In addition, adding pluggable/optional auto-detection doesn't
> > require much extra work on top of manual specification.
> > >
> > >
> > > I will answer your other questions one by one.
> > >
> > >
> > > @Yangze
> > >
> > > a) I think you are right, we do not need to expose the
> > `cluster.resource-blocklist.item.timeout-check-interval` to users.
> > >
> > > b) We can abstract the `notifyException` to a separate interface (maybe
> > BlocklistExceptionListener), and the ResourceManagerBlocklistHandler can
> > implement it in the future.
> > >
> > >
> > > @Martijn
> > >
> > > a) I also think the manual blocking should be done by cluster
> operators.
> > >
> > > b) I think manual blocking makes sense, because according to my
> > experience, users are often the first to perceive the machine problems
> > (because of job failover or delay), and they will contact cluster
> operators
> > to solve it, or even tell the cluster operators which machine is
> > prob

Re: Facing Difficulty in Deseralizing the Avro messages

2022-05-06 Thread Martijn Visser
Hi Vishist,

For future posts, please don't cross-post to both Stackoverflow and the
mailing list. This only causes extra work for volunteers who try to help.

Best regards,

Martijn

On Fri, 6 May 2022 at 09:41, Vishist Bhoopalam 
wrote:

> Hi,
>
> I have explained the problem in depth on Stack Overflow, hoping for a
> quick response since I have a deadline to catch.
>
>
> https://stackoverflow.com/questions/72137036/how-to-deseralize-avro-response-getting-from-datastream-scala-apache-flink
>
> Thanks and Regards
> BR Vishist


Re: [DISCUSS] FLIP-223: Support HiveServer2 Endpoint

2022-05-06 Thread Martijn Visser
Hi Shengkai,

Thanks for clarifying.

Best regards,

Martijn

On Fri, 6 May 2022 at 08:40, Shengkai Fang  wrote:

> Hi Martijn.
>
> > So this implementation would not rely in any way on Hive, only on Thrift?
>
> Yes. The dependency is light. We can also just copy the iface file from
> the Hive repo and maintain it ourselves.
>
> Best,
> Shengkai
>
> On Wed, May 4, 2022 at 21:44, Martijn Visser wrote:
>
> > Hi Shengkai,
> >
> > > Actually we will only rely on the API in the Hive, which only contains
> > the thrift file and the generated code
> >
> > So this implementation would not rely in any way on Hive, only on Thrift?
> >
> > Best regards,
> >
> > Martijn Visser
> > https://twitter.com/MartijnVisser82
> > https://github.com/MartijnVisser
> >
> >
> > On Fri, 29 Apr 2022 at 05:16, Shengkai Fang  wrote:
> >
> > > Hi, Jark and Martijn
> > >
> > > Thanks for your feedback.
> > >
> > > > Kyuubi provides three ways to configure Hive metastore [1]. Could we
> > > provide similar abilities?
> > >
> > > Yes. I have updated the FLIP about this, and it took some time to
> > > figure out how the JDBC driver works. I added a section about how to
> > > use the Hive JDBC to configure the session-level catalog.
> > >
> > > > I think we can improve the "HiveServer2 Compatibility" section.
> > >
> > > Yes. I have updated the FLIP and added more details about the
> > > compatibility.
> > >
> > > >  Prefer to first complete the discussion and vote on FLIP-91 then
> > discuss
> > > FLIP-223
> > >
> > > Of course. We can wait until the discussion of the FLIP-91 finishes.
> > >
> > > > Maintenance concerns about the hive
> > >
> > > Actually, we will only rely on the API in Hive, which only contains
> > > the Thrift file and the generated code [1]. I think it will not
> > > prevent us from upgrading the Java version.
> > >
> > > [1] https://github.com/apache/hive/tree/master/service-rpc
> > >
> > > Best,
> > > Shengkai
> > >
> > > On Tue, Apr 26, 2022 at 20:44, Martijn Visser wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'm not too familiar with Hive and HiveServer2, but I do have a
> couple
> > of
> > > > questions/concerns:
> > > >
> > > > 1. What is the relationship between this FLIP and FLIP-91? My
> > assumption
> > > > would be that this FLIP (and therefore the HiveServer2)
> implementation
> > > > would need to be integrated in the REST Gateway, is that correct? If
> > so,
> > > I
> > > > would prefer to first complete the discussion and vote on FLIP-91,
> else
> > > > we'll have two moving FLIPs who have a direct relationship with each
> > > other.
> > > >
> > > > 2. While I understand that Hive is important (in the Chinese
> ecosystem,
> > > not
> > > > so much in Europe and the US), I still have maintenance concerns on
> > this
> > > > topic. We know that the current Hive integration isn't exactly ideal
> > and
> > > > requires a lot of work to get in better shape. At the same time, Hive
> > > still
> > > > doesn't support Java 11 while we need (and should, given the premier
> > > > support has ended already) to move away from Java 8.
> > > >
> > > > Best regards,
> > > >
> > > > Martijn Visser
> > > > https://twitter.com/MartijnVisser82
> > > > https://github.com/MartijnVisser
> > > >
> > > >
> > > > On Mon, 25 Apr 2022 at 12:13, Jark Wu  wrote:
> > > >
> > > > > Thank Shengkai for driving this effort,
> > > > > I think this is an essential addition to Flink Batch.
> > > > >
> > > > > I have some small suggestions:
> > > > > 1) Kyuubi provides three ways to configure Hive metastore [1].
> Could
> > we
> > > > > provide similar abilities?
> > > > > Especially with the JDBC Connection URL, users can visit different
> > Hive
> > > > > metastore server instances.
> > > > >
> > > > > 2) I think we can improve the "HiveServer2 Compatibility" section.
> > > > > We need to figure out two compatibility matrices. One is SQL
> Gateway
> > > with
> > > > > different versions of Hive metastore,
> > > > > and the other is different versions of Hive client (e.g., Hive
> JDBC)
> > > with
> > > > > SQL Gateway. We need to clarify
> > > > > what metastore and client versions we support and how users
> configure
> > > the
> > > > > versions.
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > >
> > > > > [1]:
> > > > >
> > > > >
> > > >
> > >
> >
> https://kyuubi.apache.org/docs/r1.3.1-incubating/deployment/hive_metastore.html#activate-configurations
> > > > >
> > > > > On Sun, 24 Apr 2022 at 15:02, Shengkai Fang 
> > wrote:
> > > > >
> > > > > > Hi, Jiang.
> > > > > >
> > > > > > Thanks for your feedback!
> > > > > >
> > > > > > > Integrating the Hive ecosystem should not require changing the
> > > > service
> > > > > > interface
> > > > > >
> > > > > > I move the API change to FLIP-91. But I think it's possible we
> > > > > > add more interfaces to integrate the new endpoints in the future
> > > > > > because every endpoint's functionality is different. For example,
> > > > > > the REST endpoint doesn't su

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

2022-05-06 Thread Martijn Visser
Hi Paul,

Great that you could find something in the SQL standard! I'll try to read
the FLIP once more completely next week to see if I have any more concerns.

Best regards,

Martijn

On Fri, 6 May 2022 at 08:21, Paul Lam  wrote:

> I had a look at SQL-2016 that Martijn mentioned, and found that
> maybe we could follow the transaction savepoint syntax.
>
>    - SAVEPOINT <savepoint_name>
>    - RELEASE SAVEPOINT <savepoint_name>
>
> These savepoint statements are supported in lots of databases, like
> Oracle[1], PG[2], MariaDB[3].
>
> They’re usually used in the middle of a SQL transaction, so the target
> would be the current transaction. But if used in a Flink SQL session, we
> need to add a JOB/QUERY id when creating a savepoint, thus the syntax
> would be:
>
>    - SAVEPOINT <job_id> <savepoint_name>
>    - RELEASE SAVEPOINT <savepoint_name>
>
> I’m adding it as an alternative in the FLIP.
>
> [1]
> https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_10001.htm
> [2] https://www.postgresql.org/docs/current/sql-savepoint.html
> [3] https://mariadb.com/kb/en/savepoint/
>
> Best,
> Paul Lam
>
> On May 4, 2022 at 16:42, Paul Lam wrote:
>
> Hi Shengkai,
>
> Thanks a lot for your input!
>
> > I just wonder how the users can get the web ui in the application mode.
> Therefore, it's better we can list the Web UI using the SHOW statement.
> WDYT?
>
> I think it's a valid approach. I'm adding it to the FLIP.
>
> > After the investigation, I am fine with the QUERY but the keyword JOB is
> also okay to me.
>
> In addition, CockroachDB has both SHOW QUERIES [1] and SHOW JOBS [2],
> while the former shows the active running queries and the latter shows the
> background tasks like schema changes. FYI.
>
> WRT the questions:
>
> > 1. Could you add some details about the behaviour with the different
> execution.target, e.g. session, application mode?
>
> IMHO, the difference between different `execution.target` is mostly about
> cluster startup, which has little relation with the proposed statements.
> These statements rely on the current ClusterClient/JobClient API,
> which is deployment mode agnostic. Canceling a job in an application
> cluster is the same as in a session cluster.
>
> BTW, application mode support is still in development ATM [3].
>
> > 2. Considering the SQL Client/Gateway is not limited to submitting the
> job
> to the specified cluster, is it able to list jobs in the other clusters?
>
> I think multi-cluster support in SQL Client/Gateway should be aligned with
> the CLI, at least in the early phase. We may use SET to set a cluster id
> for a session, then we have access to the cluster. However, every SHOW
> statement would only involve one cluster.
>
> Best,
> Paul Lam
>
> [1] https://www.cockroachlabs.com/docs/stable/show-statements.html
> [2] https://www.cockroachlabs.com/docs/v21.2/show-jobs
> [3] https://issues.apache.org/jira/browse/FLINK-26541
>
> On Fri, Apr 29, 2022 at 15:36, Shengkai Fang wrote:
>
>> Hi.
>>
>> Thanks for Paul's update.
>>
>> > It's better we can also get the infos about the cluster where the job is
>> > running through the DESCRIBE statement.
>>
>> I just wonder how the users can get the web ui in the application mode.
>> Therefore, it's better we can list the Web UI using the SHOW statement.
>> WDYT?
>>
>>
>> > QUERY or other keywords.
>>
>> I list the statement to manage the lifecycle of the query/dml in other
>> systems:
>>
>> Mysql[1] allows users to SHOW [FULL] PROCESSLIST and use the KILL command
>> to kill the query.
>>
>> ```
>> mysql> SHOW PROCESSLIST;
>>
>> mysql> KILL 27;
>> ```
>>
>>
>> Postgres use the following statements to kill the queries.
>>
>> ```
>> SELECT pg_cancel_backend(<pid>);
>>
>> SELECT pg_terminate_backend(<pid>);
>> ```
>>
>> KSQL uses the following commands to control the query lifecycle[4].
>>
>> ```
>> SHOW QUERIES;
>>
>> TERMINATE <query_id>;
>>
>> ```
>>
>> [1] https://dev.mysql.com/doc/refman/8.0/en/show-processlist.html
>> [2] https://scaledynamix.com/blog/how-to-kill-mysql-queries/
>> [3]
>>
>> https://stackoverflow.com/questions/35319597/how-to-stop-kill-a-query-in-postgresql
>> [4]
>>
>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>> [5]
>>
>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/terminate/
>>
>> After the investigation, I am fine with the QUERY but the keyword JOB is
>> also okay to me.
>>
>> We also have two questions here.
>>
>> 1. Could you add some details about the behaviour with the different
>> execution.target, e.g. session, application mode?
>>
>> 2. Considering the SQL Client/Gateway is not limited to submitting the job
>> to the specified cluster, is it able to list jobs in the other clusters?
>>
>>
>> Best,
>> Shengkai
>>
>> On Thu, Apr 28, 2022 at 17:17, Paul Lam wrote:
>>
>> > Hi Martjin,
>> >
>> > Thanks a lot for your reply! I agree that the scope may be a bit
>> confusing,
>> > please let me clarify.
>> >
>> > The FLIP aims to add new SQL statements that are supported only in
>> > sql-client, similar to
>> > jar statements [1]. Jar statements can be parsed into jar op

Re: Emitting metrics from Flink SQL LookupTableSource

2022-05-06 Thread Martijn Visser
Hi Santhosh,

There's currently an ongoing discussion on this topic in the Dev mailing
list, see https://lists.apache.org/thread/dqw5jw4hmyct47j7m13vdfqcdnbgq0lw

Best regards,

Martijn Visser
https://twitter.com/MartijnVisser82
https://github.com/MartijnVisser


On Fri, 6 May 2022 at 04:31, santhosh venkat 
wrote:

> Hi,
>
> We are trying to develop Flink SQL connectors in my company for proprietary
> data-stores. One problem we observed is that the Flink-SQL
> LookupTablesource/LookupFunction does not seem to have capabilities to emit
> any metrics(i.e there is no metric group wired into either through
> LookupSourceContext or DynamicSourceContext). It would be great to expose
> latency and throughput metrics from these lookup sources for monitoring.
>
> I looked at the existing lookuptablesource implementations in open source
> Flink. I noticed that none of them were emitting any metrics.  Does such a
> capability exist? Please let me know if I'm missing something.
>
> Thanks.
>



Re: [DISCUSS] FLIP-228: Support Within between events in CEP Pattern

2022-05-06 Thread yue ma
Hi Nicholas,

Thanks for bringing up this discussion; we also think it's a useful feature.
Some fine-grained timeout pattern matching can be implemented in CEP, which
makes Flink CEP more powerful.
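To make the difference concrete, here is a minimal, self-contained sketch of the two interval semantics under discussion. The helper names are ours, not the actual CEP API: withinTotal mirrors what Pattern#withIn checks today (the gap between the first and last event of a match), while withinPerGap mirrors what the proposed interface would check (the gap between consecutive events).

```java
public class WithinSemantics {

    // Pattern#withIn today: the span from the first to the last matched
    // event must not exceed maxGapMillis.
    static boolean withinTotal(long[] eventTimestamps, long maxGapMillis) {
        return eventTimestamps[eventTimestamps.length - 1] - eventTimestamps[0]
                <= maxGapMillis;
    }

    // Proposed semantics: every gap between two consecutive matched events
    // must not exceed maxGapMillis.
    static boolean withinPerGap(long[] eventTimestamps, long maxGapMillis) {
        for (int i = 1; i < eventTimestamps.length; i++) {
            if (eventTimestamps[i] - eventTimestamps[i - 1] > maxGapMillis) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[] ts = {0, 4_000, 7_000}; // e.g. browse, click, purchase
        System.out.println(withinTotal(ts, 5_000));  // false: total span is 7s
        System.out.println(withinPerGap(ts, 5_000)); // true: gaps are 4s and 3s
    }
}
```

Note how the browse-then-purchase example from the FLIP matches under the per-gap semantics even when the whole sequence takes longer than the configured interval.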

On Thu, May 5, 2022 at 14:28, Nicholas wrote:

> Hi everyone,
>
>
>
>
> Pattern#withIn interface in CEP defines the maximum time interval in which
> a matching pattern has to be completed in order to be considered valid,
> which interval corresponds to the maximum time gap between first and the
> last event. The interval representing the maximum time gap between events
> is required in scenarios like purchasing goods within a maximum
> of 5 minutes after browsing.
>
>
>
>
> I would like to start a discussion about FLIP-228[1], in which within
> between events is proposed in Pattern to support the definition of the
> maximum time interval in which a completed partial matching pattern is
> considered valid, which interval represents the maximum time gap between
> events for partial matching Pattern.
>
>
>
>
> Hence we propose the Pattern#partialWithin interface to define the maximum
> time interval in which a completed partial matching pattern is considered
> valid. Please take a look at the FLIP page [1] to get more details. Any
> feedback about the FLIP-228 would be appreciated!
>
>
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-228%3A+Support+Within+between+events+in+CEP+Pattern
>
>
>
>
> Best regards,
>
> Nicholas Jiang


Re: [DISCUSS] DockerHub repository maintainers

2022-05-06 Thread Xintong Song
@Till,

Thanks for volunteering.

@Konstantin,

From my experience, the effort that requires DockerHub access in the main
project release process is quite limited. I helped Yun Gao on releasing the
1.15.0 images, and what I did was just check out the `flink-docker` repo
and run the release script, that's it. If all the sub-projects are as easy
as the main project, then it's probably ok that only a small group of
people have access. Concerning the redundancy, if a maintainer from a
sub-project is temporarily unreachable, I believe the other maintainers
would be glad to help.

It would of course be good to have more seats. I just haven't come up with
good reasons to persuade the INFRA folks. What are your suggestions?


Thank you~

Xintong Song



On Fri, May 6, 2022 at 6:38 PM Konstantin Knauf  wrote:

> Hi Xintong,
>
> it is a pity that we can only have 5 maintainers. Every (patch) release of
> flink, flink-statefun, the flink-kubernetes-operator requires a maintainer
> to publish the image then, if I am not mistaken. As its mostly different
> groups managing the sub-projects, this is quite the bottleneck. If we give
> one seat to flink-statefun maintainers, one to the
> flink-kubernetes-operator maintainers, this leaves three seats for Apache
> Flink core, and there is no redundancy for the other projects. When I
> managed the last two patch releases, the DockerHub access was also the
> biggest hurdle. Maybe we can talk to the INFRA people again. We can
> certainly reduce it, but 5 is very little.
>
> Cheers,
>
> Konstantin
>
>
>
>
>
>
> On Fri, May 6, 2022 at 09:00, Till Rohrmann <trohrm...@apache.org> wrote:
>
> > Hi everyone,
> >
> > thanks for starting this discussion Xintong. I would volunteer as a
> > maintainer of the flink-statefun Docker repository if you need one.
> >
> > Cheers,
> > Till
> >
> > On Fri, May 6, 2022 at 6:22 AM Xintong Song 
> wrote:
> >
> >> It seems to me we at least don't have a consensus on dropping the use of
> >> apache namespace, which means we need to decide on a list of maintainers
> >> anyway. So maybe we can get the discussion back to the maintainers. We
> may
> >> continue the official-image vs. apache-namespace in a separate thread if
> >> necessary.
> >>
> >> As mentioned previously, we need to reduce the number of maintainers
> from
> >> 20 to 5, as required by INFRA. Jingsong and I would like to volunteer
> as 2
> >> of the 5, and we would like to learn who else wants to join us. Of
> course
> >> the list of maintainers can be modified later.
> >>
> >> *This also means the current maintainers may be removed from the list.*
> >> Please let us know if you still need that privilege. CC-ed all the
> current
> >> maintainers for attention.
> >>
> >> Thank you~
> >>
> >> Xintong Song
> >>
> >>
> >>
> >> On Wed, May 4, 2022 at 3:14 PM Chesnay Schepler 
> >> wrote:
> >>
> >> > One advantage is that the images are periodically rebuilt to get
> >> > security fixes.
> >> >
> >> > The operator is a different story anyway because it is AFAIK only
> >> > supposed to be used via docker
> >> > (i.e., no standalone mode), which alleviates concerns about keeping
> the
> >> > logic within the image
> >> > to a minimum (which bit us in the past on the flink side).
> >> >
> >> > On 03/05/2022 16:09, Yang Wang wrote:
> >> > > The flink-kubernetes-operator project is only published
> >> > > via apache/flink-kubernetes-operator on docker hub and github
> >> packages.
> >> > > We do not find the obvious advantages by using docker hub official
> >> > images.
> >> > >
> >> > > Best,
> >> > > Yang
> >> > >
> >> > > On Thu, Apr 28, 2022 at 19:27, Xintong Song wrote:
> >> > >
> >> > >> I agree with you that doing QA for the image after the release has
> >> been
> >> > >> finalized doesn't feel right. IIUR, that is mostly because official
> >> > image
> >> > >> PR needs 1) the binary release being deployed and propagated and 2)
> >> the
> >> > >> corresponding git commit being specified. I'm not completely sure
> >> about
> >> > >> this. Maybe we can improve the process by investigating more about
> >> the
> >> > >> feasibility of pre-verifying an official image PR before finalizing
> >> the
> >> > >> release. It's definitely a good thing to do if possible.
> >> > >>
> >> > >> I also agree that QA from DockerHub folks is valuable to us.
> >> > >>
> >> > >> I'm not against publishing official-images, and I'm not against
> >> working
> >> > >> closely with the DockerHub folks to improve the process of
> delivering
> >> > the
> >> > >> official image. However, I don't think these should become reasons
> >> that
> >> > we
> >> > >> don't release our own apache/flink images.
> >> > >>
> >> > >> Taking the 1.12.0 as an example, admittedly it would be nice for us
> >> to
> >> > >> comply with the DockerHub folks' standards and not have a
> >> > >> just-for-kubernetes command in our entrypoint. However, this is IMO
> >> far
> >> > >> less important compared to delivering the image to our users
> timely.
> 

Re: [Discuss] Creating an Apache Flink slack workspace

2022-05-06 Thread Piotr Nowojski
Hi Xintong,

I'm not sure if slack is the right tool for the job. IMO it works great as
an adhoc tool for discussion between developers, but it's not searchable
and it's not persistent. Between devs, it works fine, as long as the result
of the ad hoc discussions is backported to JIRA/mailing list/design doc.
For users, that simply would be extremely difficult to achieve. As a
result, I would be afraid we'd be answering the same questions over and
over again, without even a way to provide a link to the previous
thread, because nobody can search for it.

I'm +1 for having an open and shared slack space/channel for the
contributors, but I think I would be -1 for such channels for the users.

For users, I would prefer to focus more on, for example, stackoverflow.
With upvoting, clever sorting of the answers (not the oldest/newest at top)
it's easily searchable - those features make it fit our use case much
better IMO.

Best,
Piotrek



On Fri, May 6, 2022 at 11:08, Xintong Song wrote:

> Thank you~
>
> Xintong Song
>
>
>
> -- Forwarded message -
> From: Xintong Song 
> Date: Fri, May 6, 2022 at 5:07 PM
> Subject: Re: [Discuss] Creating an Apache Flink slack workspace
> To: private 
> Cc: Chesnay Schepler 
>
>
> Hi Chesnay,
>
> Correct me if I'm wrong, but I don't find that this has been *repeatedly*
> discussed on the ML. The only discussions I can find are [1] & [2], from
> 4 years ago. On the other hand, I do find many users asking whether
> Slack should be supported [2][3][4]. Besides, I also find a recent
> discussion thread from ComDev [5], where alternative communication channels
> are being discussed. It seems to me the ASF is quite open to having such
> additional channels, and they have worked well for many projects
> already.
>
> I see two reasons for bringing up this discussion again:
> 1. There are indeed many things that have changed during the past 4 years.
> We have more contributors, including committers and PMC members, and even
> more users from various organizations and timezones. That also means more
> discussions and Q&As are happening.
> 2. The proposal here is different from the previous discussion. Instead of
> maintaining a channel for Flink in the ASF workspace, here we are proposing
> to create a dedicated Apache Flink slack workspace. And instead of *moving*
> the discussion to Slack, we are proposing to add a Slack Workspace as an
> addition to the ML.
>
> Below are the opinions that I found from your previous -1 [1]. IIUC, these
> are all about using the ASF Slack workspace. If I overlooked anything,
> please let me know.
>
> > 1. According to INFRA-14292 <
> > https://issues.apache.org/jira/browse/INFRA-14292> the ASF Slack isn't
> > run by the ASF. This alone puts this service into rather questionable
> > territory as it /looks/ like an official ASF service. If anyone can
> provide
> > information to the contrary, please do so.
>
> 2. We already discuss things on the mailing lists, JIRA and GitHub. All of
> > these are available to the public, whereas the slack channel requires an
> > @apache mail address, i.e. you have to be a committer. This minimizes the
> > target audience rather significantly. I would much rather prefer
> something
> > that is also available to contributors.
>
>
> I do agree this should be decided by the whole community. I'll forward this
> to dev@ and user@ ML.
>
> Thank you~
>
> Xintong Song
>
>
> [1] https://lists.apache.org/thread/gxwv49ssq82g06dbhy339x6rdxtlcv3d
> [2] https://lists.apache.org/thread/kcym1sozkrtwxw1fjbnwk1nqrrlzolcc
> [3] https://lists.apache.org/thread/7rmd3ov6sv3wwhflp97n4czz25hvmqm6
> [4] https://lists.apache.org/thread/n5y1kzv50bkkbl3ys494dglyxl45bmts
> [5] https://lists.apache.org/thread/fzwd3lj0x53hkq3od5ot0y719dn3kj1j
>
> On Fri, May 6, 2022 at 3:05 PM Chesnay Schepler 
> wrote:
>
> > This has been repeatedly discussed on the ML over the years and was
> > rejected every time.
> >
> > I don't see that anything has changed that would invalidate the
> previously
> > raised arguments against it, so I'm still -1 on it.
> >
> > This is also not something the PMC should decide anyway, but the project
> > as a whole.
> >
> > On 06/05/2022 06:48, Jark Wu wrote:
> >
> > Thank Xintong, for starting this exciting topic.
> >
> > I think Slack would be an essential addition to the mailing list.
> > I have talked with some Flink users, and they are surprised
> > Flink doesn't have Slack yet, and they would love to use Slack.
> > We can also see a trend that new open-source communities
> > are using Slack as the community base camp.
> >
> > Slack is also helpful for brainstorming and asking people for opinions
> and
> > use cases.
> > I think Slack is not only another place for Q&A but also a connection to
> > the Flink users.
> > We can create more channels to make the community have more social
> > attributes, for example,
> >  - Share ideas, projects, integrations, articles, and presentations
> > related to Flink 

Re: [DISCUSS] FLIP-220: Temporal State

2022-05-06 Thread Nico Kruber
While working a bit more on this, David A and I noticed a couple of things 
that were not matching each other in the proposed APIs:

a) the proposed BinarySortedMultiMapState class didn't actually have getters 
that return multiple items per key, and
b) while having a single multi-map like implementation in the backend, with 
the adapted API, we'd like to put it up for discussion again whether we maybe 
want to have a user-facing BinarySortedMapState API as well which can be 
simpler but doesn't add any additional constraints to the state backends.

Let me go into details a bit more:
in a multi-map, a single key can be backed by a set of items and as such, the 
atomic unit that should be retrievable is not a single item but rather 
something like a Collection, an Iterable , a List, or so. Since we are already 
using Iterable in the main API, how about the following?
```
public class BinarySortedMultiMapState<UK, UV> extends State {
  void put(UK key, Iterable<UV> values) throws Exception;
  void add(UK key, UV value) throws Exception;

  Iterable<UV> valueAt(UK key) throws Exception;

  Map.Entry<UK, Iterable<UV>> firstEntry() throws Exception;
  Map.Entry<UK, Iterable<UV>> lastEntry() throws Exception;

  Iterable<Map.Entry<UK, Iterable<UV>>> readRange(UK fromKey, UK toKey) throws Exception;
  Iterable<Map.Entry<UK, Iterable<UV>>> readRangeUntil(UK endUserKey) throws Exception;
  Iterable<Map.Entry<UK, Iterable<UV>>> readRangeFrom(UK startUserKey) throws Exception;

  void clearRange(UK fromKey, UK toKey) throws Exception;
  void clearRangeUntil(UK endUserKey) throws Exception;
  void clearRangeFrom(UK startUserKey) throws Exception;
}
```

We also considered using Iterable<Map.Entry<UK, UV>> instead of
Map.Entry<UK, Iterable<UV>>, but that wouldn't match well with firstEntry() and
lastEntry(), because for a single key there is not a single first/last value. We
also looked at common MultiMap interfaces, and their getters were also always
retrieving the whole list/collection for a key. Since we don't want to promise
too many details to the user, we believe Iterable is our best choice for now,
but that can also be "upgraded" to, e.g., List in the future without breaking
client code.
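For intuition only, the multi-map semantics above can be modelled on the heap with a plain TreeMap. This is an illustrative sketch under our own assumptions (class name, inclusive range bounds), not the proposed state-backend implementation — in particular, the real backend would order keys by their serialized bytes rather than by a Java comparator:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative heap-based model of the proposed multi-map state semantics.
public class SortedMultiMapSketch<K, V> {
    private final NavigableMap<K, List<V>> map = new TreeMap<>();

    public void add(K key, V value) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    public Iterable<V> valueAt(K key) {
        return map.getOrDefault(key, Collections.emptyList());
    }

    public Map.Entry<K, List<V>> firstEntry() { return map.firstEntry(); }

    public Map.Entry<K, List<V>> lastEntry() { return map.lastEntry(); }

    // Both bounds inclusive here -- one possible choice the FLIP would
    // still need to pin down.
    public Iterable<Map.Entry<K, List<V>>> readRange(K fromKey, K toKey) {
        return map.subMap(fromKey, true, toKey, true).entrySet();
    }

    public void clearRangeUntil(K endKey) {
        map.headMap(endKey, true).clear();
    }

    public static void main(String[] args) {
        SortedMultiMapSketch<Long, String> s = new SortedMultiMapSketch<>();
        s.add(1L, "a");
        s.add(1L, "b");
        s.add(3L, "c");
        System.out.println(s.firstEntry()); // prints 1=[a, b]
    }
}
```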

An appropriate map-like version of that would be the following:
```
public class BinarySortedMapState<UK, UV> extends State {
  void put(UK key, UV value) throws Exception;

  UV valueAt(UK key) throws Exception;

  Map.Entry<UK, UV> firstEntry() throws Exception;
  Map.Entry<UK, UV> lastEntry() throws Exception;

  Iterable<Map.Entry<UK, UV>> readRange(UK fromKey, UK toKey) throws Exception;
  Iterable<Map.Entry<UK, UV>> readRangeUntil(UK endUserKey) throws Exception;
  Iterable<Map.Entry<UK, UV>> readRangeFrom(UK startUserKey) throws Exception;

  void clearRange(UK fromKey, UK toKey) throws Exception;
  void clearRangeUntil(UK endUserKey) throws Exception;
  void clearRangeFrom(UK startUserKey) throws Exception;
}
```


We believe we were also missing details regarding the state descriptor, and
I'm still a bit fuzzy on what to provide as type T in StateDescriptor<S, T>.
For the constructors, however, since we'd require a
LexicographicalTypeSerializer<UK> implementation, we would propose the
following three overloads, similar to the MapStateDescriptor:
```
public class BinarySortedMultiMapStateDescriptor<UK, UV> extends
    StateDescriptor<BinarySortedMultiMapState<UK, UV>, Map<UK, Iterable<UV>>/*?*/> {

  public BinarySortedMultiMapStateDescriptor(
      String name, LexicographicalTypeSerializer<UK> keySerializer,
      TypeSerializer<UV> valueSerializer) {}

  public BinarySortedMultiMapStateDescriptor(
      String name, LexicographicalTypeSerializer<UK> keySerializer,
      TypeInformation<UV> valueTypeInfo) {}

  public BinarySortedMultiMapStateDescriptor(
      String name, LexicographicalTypeSerializer<UK> keySerializer,
      Class<UV> valueClass) {}
}
```
Technically, we could have a LexicographicalTypeInformation as well (for the 
2nd overload) but don't currently see the need for that wrapper since these 
serializers are just needed for State - but maybe someone with more insights 
into this topic can advise.
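One property worth spelling out: whatever the LexicographicalTypeSerializer produces must compare, as unsigned bytes, in the same order as the original values, so that a byte-oriented backend iterates ranges correctly. As an illustrative sketch (our own code, not Flink's serializer), the classic sign-bit flip makes big-endian longs byte-wise comparable:

```java
public class OrderedLongEncoding {

    // Encode a long so that unsigned lexicographic byte order matches
    // numeric order: flip the sign bit, then write big-endian.
    public static byte[] encode(long v) {
        long flipped = v ^ Long.MIN_VALUE; // flip the sign bit
        byte[] out = new byte[8];
        for (int i = 7; i >= 0; i--) {
            out[i] = (byte) flipped;
            flipped >>>= 8;
        }
        return out;
    }

    // Unsigned byte-wise comparison, as a byte-oriented backend would do it.
    public static int compareLex(byte[] a, byte[] b) {
        for (int i = 0; i < 8; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Without the sign-bit flip, -5 would sort after 3 byte-wise.
        System.out.println(compareLex(encode(-5L), encode(3L)) < 0); // true
    }
}
```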


A few further points to note with respect to the implementation:
- we'll have to find a suitable heap-based state backend implementation that 
integrates well with all current efforts (needs to be discussed)
- the StateProcessor API will have to receive appropriate extensions to read 
and write this new form of state



Nico


On Friday, 29 April 2022 14:25:59 CEST Nico Kruber wrote:
> Hi all,
> Yun, David M, David A, and I had an offline discussion and talked through a
> couple of details that emerged from the discussion here. We believe, we have
> found a consensus on these points and would like to share our points for
> further feedback:
> 
> Let me try to get through the points that were opened in arbitrary order:
> 
> 
> 1. We want to offer a generic interface for sorted state, not just temporal
> state as proposed initially. We would like to...
> a) ...offer a single new state type similar to what TemporalListState was
> offering (so not offering something like TemporalValueState to keep the API
> slim).
> b) ...name it BinarySortedMultiMap with Java-Object keys and values
> (I'll go into the API further below) - the naming stress

[jira] [Created] (FLINK-27534) Apply scalafmt to 1.15 branch

2022-05-06 Thread Timo Walther (Jira)
Timo Walther created FLINK-27534:


 Summary: Apply scalafmt to 1.15 branch
 Key: FLINK-27534
 URL: https://issues.apache.org/jira/browse/FLINK-27534
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Timo Walther
Assignee: Timo Walther


As discussed on the mailing list:

https://lists.apache.org/thread/9jznwjh73jhcncnx46531kzyr0q7pz90

We backport scalafmt to 1.15 to ease merging of patches.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27535) Optimize the unit test execution time

2022-05-06 Thread Dong Lin (Jira)
Dong Lin created FLINK-27535:


 Summary: Optimize the unit test execution time
 Key: FLINK-27535
 URL: https://issues.apache.org/jira/browse/FLINK-27535
 Project: Flink
  Issue Type: Improvement
  Components: Library / Machine Learning
Affects Versions: ml-2.1.0
Reporter: Dong Lin


Currently `mvn package` takes 10 minutes to complete in GitHub Actions. A lot 
of time is spent in running unit tests for algorithms. For example, 
LogisticRegressionTest takes 82 seconds and KMeansTest takes 43 seconds in [1]. 

This time appears to be more than expected. And it will considerably reduce 
developer velocity if a developer needs to wait for hours to get test results 
once we have 100+ algorithms in Flink ML.

We should understand why it takes 82 seconds to run e.g. LogisticRegressionTest 
and see if there is a way to optimize the test execution time.

[1] https://github.com/apache/flink-ml/runs/6319402103?check_suite_focus=true.
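If the bottleneck turns out to be serial execution rather than the algorithms themselves, one option worth measuring is letting Surefire fork multiple test JVMs. This is a sketch only; flink-ml's actual Surefire configuration and the root cause of the slowness may differ:

```xml
<!-- Hypothetical snippet for flink-ml's pom.xml: run test classes in
     parallel forks. forkCount=1C means one fork per CPU core;
     reuseForks keeps JVM startup cost down across classes. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <forkCount>1C</forkCount>
    <reuseForks>true</reuseForks>
  </configuration>
</plugin>
```

Profiling a single slow test (e.g. whether time goes into MiniCluster startup or the algorithm iterations) should come first, since parallelism only hides, rather than removes, per-test cost.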







Re: [Discuss] Creating an Apache Flink slack workspace

2022-05-06 Thread Martijn Visser
Hi everyone,

While I see Slack having a major downside (the results are not indexed by
external search engines, you can't link directly to Slack content unless
you've signed up), I do think that the open source space has progressed and
that Slack is considered invaluable by users. There are
other Apache projects that also run it, like Apache Airflow [1]. I also see
it as a potential option to create a more active community.

A concern I can see is that users will start DMing well-known
reviewers/committers to get a review or a PR merged. That can cause a lot
of noise. I can go +1 for Slack, but then we need to establish a set of
community rules.

Best regards,

Martijn

[1] https://airflow.apache.org/community/

On Fri, 6 May 2022 at 13:59, Piotr Nowojski  wrote:

> Hi Xintong,
>
> I'm not sure if slack is the right tool for the job. IMO it works great as
> an adhoc tool for discussion between developers, but it's not searchable
> and it's not persistent. Between devs, it works fine, as long as the result
> of the ad hoc discussions is backported to JIRA/mailing list/design doc.
> For users, that simply would be extremely difficult to achieve. As a
> result, I would be afraid we'd be answering the same questions over and
> over again, without even a way to provide a link to the previous
> thread, because nobody can search for it.
>
> I'm +1 for having an open and shared slack space/channel for the
> contributors, but I think I would be -1 for such channels for the users.
>
> For users, I would prefer to focus more on, for example, stackoverflow.
> With upvoting, clever sorting of the answers (not the oldest/newest at top)
> it's easily searchable - those features make it fit our use case much
> better IMO.
>
> Best,
> Piotrek
>
>
>
> pt., 6 maj 2022 o 11:08 Xintong Song  napisał(a):
>
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > -- Forwarded message -
> > From: Xintong Song 
> > Date: Fri, May 6, 2022 at 5:07 PM
> > Subject: Re: [Discuss] Creating an Apache Flink slack workspace
> > To: private 
> > Cc: Chesnay Schepler 
> >
> >
> > Hi Chesnay,
> >
> > Correct me if I'm wrong, I don't find this is *repeatedly* discussed on
> the
> > ML. The only discussions I find are [1] & [2], which are 4 years ago. On
> > the other hand, I do find many users are asking questions about whether
> > Slack should be supported [2][3][4]. Besides, I also find a recent
> > discussion thread from ComDev [5], where alternative communication
> channels
> > are being discussed. It seems to me ASF is quite open to having such
> > additional channels and they have worked well for many projects
> > already.
> >
> > I see two reasons for bringing this discussion again:
> > 1. There are indeed many things that have changed during the past 4 years.
> > We have more contributors, including committers and PMC members, and even
> > more users from various organizations and timezones. That also means more
> > discussions and Q&As are happening.
> > 2. The proposal here is different from the previous discussion. Instead
> of
> > maintaining a channel for Flink in the ASF workspace, here we are
> proposing
> > to create a dedicated Apache Flink slack workspace. And instead of
> *moving*
> > the discussion to Slack, we are proposing to add a Slack Workspace as an
> > addition to the ML.
> >
> > Below are the opinions that I found from your previous -1 [1]. IIUR,
> these
> > are all about using the ASF Slack Workspace. If I overlooked anything,
> > please let me know.
> >
> > > 1. According to INFRA-14292 <
> > > https://issues.apache.org/jira/browse/INFRA-14292> the ASF Slack isn't
> > > run by the ASF. This alone puts this service into rather questionable
> > > territory as it /looks/ like an official ASF service. If anyone can
> > provide
> > > information to the contrary, please do so.
> >
> > 2. We already discuss things on the mailing lists, JIRA and GitHub. All
> of
> > > these are available to the public, whereas the slack channel requires
> an
> > > @apache mail address, i.e. you have to be a committer. This minimizes
> the
> > > target audience rather significantly. I would much rather prefer
> > something
> > > that is also available to contributors.
> >
> >
> > I do agree this should be decided by the whole community. I'll forward
> this
> > to dev@ and user@ ML.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> > [1] https://lists.apache.org/thread/gxwv49ssq82g06dbhy339x6rdxtlcv3d
> > [2] https://lists.apache.org/thread/kcym1sozkrtwxw1fjbnwk1nqrrlzolcc
> > [3] https://lists.apache.org/thread/7rmd3ov6sv3wwhflp97n4czz25hvmqm6
> > [4] https://lists.apache.org/thread/n5y1kzv50bkkbl3ys494dglyxl45bmts
> > [5] https://lists.apache.org/thread/fzwd3lj0x53hkq3od5ot0y719dn3kj1j
> >
> > On Fri, May 6, 2022 at 3:05 PM Chesnay Schepler 
> > wrote:
> >
> > > This has been repeatedly discussed on the ML over the years and was
> > > rejected every time.
> > >
> > > I don't see that anything

Re: [Discuss] Creating an Apache Flink slack workspace

2022-05-06 Thread Becket Qin
Thanks for the proposal, Xintong.

While I share the same concerns as those mentioned in the previous
discussion thread, admittedly there are benefits of having a slack channel
as a supplementary way to discuss Flink. The fact that this topic is raised
once in a while indicates lasting interest.

Personally I am open to having such a slack channel. Although it has
drawbacks, it serves a different purpose. I'd imagine that for people who
prefer instant messaging, in absence of the slack channel, a lot of
discussions might just take place offline today, which leaves no public
record at all.

One step further, if the channel is maintained by the Flink PMC, some kind
of code of conduct might be necessary. I think the suggestions of ad-hoc
conversations, reflecting back to the emails are good starting points. I
am +1 to give it a try and see how it goes. In the worst case, we can just
stop doing this and come back to where we are right now.

Thanks,

Jiangjie (Becket) Qin

On Fri, May 6, 2022 at 8:55 PM Martijn Visser  wrote:

> Hi everyone,
>
> While I see Slack having a major downside (the results are not indexed by
> external search engines, you can't link directly to Slack content unless
> you've signed up), I do think that the open source space has progressed and
> that Slack is considered as something that's invaluable to users. There are
> other Apache programs that also run it, like Apache Airflow [1]. I also see
> it as a potential option to create a more active community.
>
> A concern I can see is that users will start DMing well-known
> reviewers/committers to get a review or a PR merged. That can cause a lot
> of noise. I can go +1 for Slack, but then we need to establish a set of
> community rules.
>
> Best regards,
>
> Martijn
>
> [1] https://airflow.apache.org/community/
>
> On Fri, 6 May 2022 at 13:59, Piotr Nowojski  wrote:
>
>> Hi Xintong,
>>
>> I'm not sure if slack is the right tool for the job. IMO it works great as
>> an adhoc tool for discussion between developers, but it's not searchable
>> and it's not persistent. Between devs, it works fine, as long as the
>> result
>> of the ad hoc discussions is backported to JIRA/mailing list/design doc.
>> For users, that simply would be extremely difficult to achieve. In the
>> result, I would be afraid we are answering the same questions over, and
>> over and over again, without even a way to provide a link to the previous
>> thread, because nobody can search for it .
>>
>> I'm +1 for having an open and shared slack space/channel for the
>> contributors, but I think I would be -1 for such channels for the users.
>>
>> For users, I would prefer to focus more on, for example, stackoverflow.
>> With upvoting, clever sorting of the answers (not the oldest/newest at
>> top)
>> it's easily searchable - those features make it fit our use case much
>> better IMO.
>>
>> Best,
>> Piotrek
>>
>>
>>
>> pt., 6 maj 2022 o 11:08 Xintong Song  napisał(a):
>>
>> > Thank you~
>> >
>> > Xintong Song
>> >
>> >
>> >
>> > -- Forwarded message -
>> > From: Xintong Song 
>> > Date: Fri, May 6, 2022 at 5:07 PM
>> > Subject: Re: [Discuss] Creating an Apache Flink slack workspace
>> > To: private 
>> > Cc: Chesnay Schepler 
>> >
>> >
>> > Hi Chesnay,
>> >
>> > Correct me if I'm wrong, I don't find this is *repeatedly* discussed on
>> the
>> > ML. The only discussions I find are [1] & [2], which are 4 years ago. On
>> > the other hand, I do find many users are asking questions about whether
>> > Slack should be supported [2][3][4]. Besides, I also find a recent
>> > discussion thread from ComDev [5], where alternative communication
>> channels
>> > are being discussed. It seems to me ASF is quite open to having such
>> > additional channels and they have worked well for many projects
>> > already.
>> >
>> > I see two reasons for bringing this discussion again:
>> > 1. There are indeed many things that have changed during the past 4
>> years.
>> > We have more contributors, including committers and PMC members, and
>> even
>> > more users from various organizations and timezones. That also means
>> more
>> > discussions and Q&As are happening.
>> > 2. The proposal here is different from the previous discussion. Instead
>> of
>> > maintaining a channel for Flink in the ASF workspace, here we are
>> proposing
>> > to create a dedicated Apache Flink slack workspace. And instead of
>> *moving*
>> > the discussion to Slack, we are proposing to add a Slack Workspace as an
>> > addition to the ML.
>> >
>> > Below are the opinions that I found from your previous -1 [1]. IIUR,
>> these
>> > are all about using the ASF Slack Workspace. If I overlooked anything,
>> > please let me know.
>> >
>> > > 1. According to INFRA-14292 <
>> > > https://issues.apache.org/jira/browse/INFRA-14292> the ASF Slack
>> isn't
>> > > run by the ASF. This alone puts this service into rather questionable
>> > > territory as it /looks/ like an official ASF service. If anyone can
>> > provide

Re: [DISCUSS] FLIP-220: Temporal State

2022-05-06 Thread Jingsong Li
+1 to generic interface for sorted state and Binary***State.

Very happy to see this go one step further, and thank you for the discussion.

Best,
Jingsong

On Fri, May 6, 2022 at 8:37 PM Nico Kruber  wrote:
>
> While working a bit more on this, David A and I noticed a couple of things
> that were not matching each other in the proposed APIs:
>
> a) the proposed BinarySortedMultiMapState class didn't actually have getters
> that return multiple items per key, and
> b) while having a single multi-map like implementation in the backend, with
> the adapted API, we'd like to put it up for discussion again whether we maybe
> want to have a user-facing BinarySortedMapState API as well which can be
> simpler but doesn't add any additional constraints to the state backends.
>
> Let me go into details a bit more:
> in a multi-map, a single key can be backed by a set of items and as such, the
> atomic unit that should be retrievable is not a single item but rather
> something like a Collection, an Iterable , a List, or so. Since we are already
> using Iterable in the main API, how about the following?
> ```
> public class BinarySortedMultiMapState<UK, UV> extends State {
>   void put(UK key, Iterable<UV> values) throws Exception;
>   void add(UK key, UV value) throws Exception;
>
>   Iterable<UV> valueAt(UK key) throws Exception;
>
>   Map.Entry<UK, Iterable<UV>> firstEntry() throws Exception;
>   Map.Entry<UK, Iterable<UV>> lastEntry() throws Exception;
>
>   Iterable<Map.Entry<UK, Iterable<UV>>> readRange(UK fromKey, UK toKey)
> throws Exception;
>   Iterable<Map.Entry<UK, Iterable<UV>>> readRangeUntil(UK endUserKey)
> throws Exception;
>   Iterable<Map.Entry<UK, Iterable<UV>>> readRangeFrom(UK startUserKey)
> throws Exception;
>
>   void clearRange(UK fromKey, UK toKey) throws Exception;
>   void clearRangeUntil(UK endUserKey) throws Exception;
>   void clearRangeFrom(UK startUserKey) throws Exception;
> }
> ```
>
> We also considered using Iterable<Map.Entry<UK, UV>> instead of Map.Entry<UK,
> Iterable<UV>>, but that wouldn't match well with firstEntry() and lastEntry()
> because for a single key, there is not a single first/last value. We also
> looked at common MultiMap interfaces and their getters were also always
> retrieving the whole list/collection for a key. Since we don't want to promise
> too many details to the user, we believe Iterable is our best choice for now
> but that can also be "upgraded" to, e.g., List in the future without breaking
> client code.
>
> An appropriate map-like version of that would be the following:
> ```
> public class BinarySortedMapState<UK, UV> extends State {
>   void put(UK key, UV value) throws Exception;
>
>   UV valueAt(UK key) throws Exception;
>
>   Map.Entry<UK, UV> firstEntry() throws Exception;
>   Map.Entry<UK, UV> lastEntry() throws Exception;
>
>   Iterable<Map.Entry<UK, UV>> readRange(UK fromKey, UK toKey) throws
> Exception;
>   Iterable<Map.Entry<UK, UV>> readRangeUntil(UK endUserKey) throws Exception;
>   Iterable<Map.Entry<UK, UV>> readRangeFrom(UK startUserKey) throws Exception;
>
>   void clearRange(UK fromKey, UK toKey) throws Exception;
>   void clearRangeUntil(UK endUserKey) throws Exception;
>   void clearRangeFrom(UK startUserKey) throws Exception;
> }
> ```
>
>
> We believe we were also missing details regarding the state descriptor, and
> I'm still a bit fuzzy on what to provide as type T in StateDescriptor<S
> extends State, T>.
> For the constructors, however, since we'd require a
> LexicographicalTypeSerializer implementation, we would propose the following
> three overloads similar to the MapStateDescriptor:
> ```
> public class BinarySortedMultiMapStateDescriptor<UK, UV> extends
> StateDescriptor<BinarySortedMultiMapState<UK, UV>, Map<UK, Iterable<UV>>/*?*/> {
>
> public BinarySortedMultiMapStateDescriptor(
> String name, LexicographicalTypeSerializer<UK> keySerializer,
> TypeSerializer<UV> valueSerializer) {}
>
> public BinarySortedMultiMapStateDescriptor(
> String name, LexicographicalTypeSerializer<UK> keySerializer,
> TypeInformation<UV> valueTypeInfo) {}
>
> public BinarySortedMultiMapStateDescriptor(
> String name, LexicographicalTypeSerializer<UK> keySerializer,
> Class<UV> valueClass) {}
> }
> ```
> Technically, we could have a LexicographicalTypeInformation as well (for the
> 2nd overload) but don't currently see the need for that wrapper since these
> serializers are just needed for State - but maybe someone with more insights
> into this topic can advise.
>
>
> A few further points to note with respect to the implementation:
> - we'll have to find a suitable heap-based state backend implementation that
> integrates well with all current efforts (needs to be discussed)
> - the StateProcessor API will have to receive appropriate extensions to read
> and write this new form of state
>
>
>
> Nico
>
>
> On Friday, 29 April 2022 14:25:59 CEST Nico Kruber wrote:
> > Hi all,
> > Yun, David M, David A, and I had an offline discussion and talked through a
> > couple of details that emerged from the discussion here. We believe, we have
> > found a consensus on these points and would like to share our points for
> > further feedback:
> >
> > Let me try to get through the points that were opened in arbitrary order:
> >
> >
> > 

Edit Permissions for Flink Connector Template

2022-05-06 Thread Ber, Jeremy
Hello,

I require Confluence Edit Permissions in order to create a Flink Connector 
Template page as discussed via e-mail.

Jeremy


[jira] [Created] (FLINK-27536) Rename method parameter in AsyncSinkWriter

2022-05-06 Thread Zichen Liu (Jira)
Zichen Liu created FLINK-27536:
--

 Summary: Rename method parameter in AsyncSinkWriter
 Key: FLINK-27536
 URL: https://issues.apache.org/jira/browse/FLINK-27536
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Common
Affects Versions: 1.15.0
Reporter: Zichen Liu


Change the abstract method's parameter naming in AsyncSinkWriter.

From

  Consumer> requestResult

to

  Consumer> requestToRetry

or something similar.

 

This is because the consumer here is supposed to accept a list of requests that 
need to be retried.





Re: [Discuss] Creating an Apache Flink slack workspace

2022-05-06 Thread David Anderson
I have mixed feelings about this.

I have been rather visible on stack overflow, and as a result I get a lot
of DMs asking for help. I enjoy helping, but want to do it on a platform
where the responses can be searched and shared.

It is currently the case that good questions on stack overflow frequently
go unanswered because no one with the necessary expertise takes the time to
respond. If the Flink community has the collective energy to do more user
outreach, more involvement on stack overflow would be a good place to
start. Adding slack as another way for users to request help from those who
are already actively providing support on the existing communication
channels might just lead to burnout.

On the other hand, there are rather rare, but very interesting cases where
considerable back and forth is needed to figure out what's going on. This
can happen, for example, when the requirements are unusual, or when a
difficult to diagnose bug is involved. In these circumstances, something
like slack is much better suited than email or stack overflow.

David

On Fri, May 6, 2022 at 3:04 PM Becket Qin  wrote:

> Thanks for the proposal, Xintong.
>
> While I share the same concerns as those mentioned in the previous
> discussion thread, admittedly there are benefits of having a slack channel
> as a supplementary way to discuss Flink. The fact that this topic is raised
> once in a while indicates lasting interest.
>
> Personally I am open to having such a slack channel. Although it has
> drawbacks, it serves a different purpose. I'd imagine that for people who
> prefer instant messaging, in absence of the slack channel, a lot of
> discussions might just take place offline today, which leaves no public
> record at all.
>
> One step further, if the channel is maintained by the Flink PMC, some kind
> of code of conduct might be necessary. I think the suggestions of ad-hoc
> conversations, reflecting back to the emails are good starting points. I
> am +1 to give it a try and see how it goes. In the worst case, we can just
> stop doing this and come back to where we are right now.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, May 6, 2022 at 8:55 PM Martijn Visser 
> wrote:
>
>> Hi everyone,
>>
>> While I see Slack having a major downside (the results are not indexed by
>> external search engines, you can't link directly to Slack content unless
>> you've signed up), I do think that the open source space has progressed and
>> that Slack is considered as something that's invaluable to users. There are
>> other Apache programs that also run it, like Apache Airflow [1]. I also see
>> it as a potential option to create a more active community.
>>
>> A concern I can see is that users will start DMing well-known
>> reviewers/committers to get a review or a PR merged. That can cause a lot
>> of noise. I can go +1 for Slack, but then we need to establish a set of
>> community rules.
>>
>> Best regards,
>>
>> Martijn
>>
>> [1] https://airflow.apache.org/community/
>>
>> On Fri, 6 May 2022 at 13:59, Piotr Nowojski  wrote:
>>
>>> Hi Xintong,
>>>
>>> I'm not sure if slack is the right tool for the job. IMO it works great
>>> as
>>> an adhoc tool for discussion between developers, but it's not searchable
>>> and it's not persistent. Between devs, it works fine, as long as the
>>> result
>>> of the ad hoc discussions is backported to JIRA/mailing list/design doc.
>>> For users, that simply would be extremely difficult to achieve. In the
>>> result, I would be afraid we are answering the same questions over, and
>>> over and over again, without even a way to provide a link to the previous
>>> thread, because nobody can search for it .
>>>
>>> I'm +1 for having an open and shared slack space/channel for the
>>> contributors, but I think I would be -1 for such channels for the users.
>>>
>>> For users, I would prefer to focus more on, for example, stackoverflow.
>>> With upvoting, clever sorting of the answers (not the oldest/newest at
>>> top)
>>> it's easily searchable - those features make it fit our use case much
>>> better IMO.
>>>
>>> Best,
>>> Piotrek
>>>
>>>
>>>
>>> pt., 6 maj 2022 o 11:08 Xintong Song  napisał(a):
>>>
>>> > Thank you~
>>> >
>>> > Xintong Song
>>> >
>>> >
>>> >
>>> > -- Forwarded message -
>>> > From: Xintong Song 
>>> > Date: Fri, May 6, 2022 at 5:07 PM
>>> > Subject: Re: [Discuss] Creating an Apache Flink slack workspace
>>> > To: private 
>>> > Cc: Chesnay Schepler 
>>> >
>>> >
>>> > Hi Chesnay,
>>> >
>>> > Correct me if I'm wrong, I don't find this is *repeatedly* discussed
>>> on the
>>> > ML. The only discussions I find are [1] & [2], which are 4 years ago.
>>> On
>>> > the other hand, I do find many users are asking questions about whether
>>> > Slack should be supported [2][3][4]. Besides, I also find a recent
>>> > discussion thread from ComDev [5], where alternative communication
>>> channels
>>> > are being discussed. It seems to me ASF is quite open to having such
>>> > additional ch

Re: [DISCUSS] FLIP-224: Blacklist Mechanism

2022-05-06 Thread Lijie Wang
Thanks for your feedback, Jiangang and Martijn.

@Jiangang


> For auto-detecting, I wonder how to make the strategy and mark a node
blocked?

In fact, we currently plan to not support auto-detection in this FLIP. The
part about auto-detection may be continued in a separate FLIP in the
future. Some guys have the same concerns as you, and the correctness and
necessity of auto-detection may require further discussion in the future.

> In session mode, multi jobs can fail on the same bad node and the node
should be marked blocked.
By design, the blocklist information will be shared among all jobs in a
cluster/session. The JM will sync blocklist information with RM.

@Martijn

> I agree with Yang Wang on this.
As Zhu Zhu and I mentioned above, we think MARK_BLOCKLISTED (which just limits
the load of the node and does not kill all the processes on it) is also
important, and we think that external systems (*yarn rmadmin or kubectl
taint*) cannot support it. So we think it makes sense even if only *manual*.

> I also agree with Chesnay that magical mechanisms are indeed super hard
to get right.
Yes, as you see, Jiangang (and a few others) have the same concern.
However, we currently plan not to support auto-detection in this FLIP, only
*manual* blocking. In addition, I'd like to say that the FLIP provides a
mechanism to support MARK_BLOCKLISTED and MARK_BLOCKLISTED_AND_EVACUATE_TASKS;
the auto-detection may be done by external systems.

Best,
Lijie

Martijn Visser  于2022年5月6日周五 19:04写道:

> > If we only support to block nodes manually, then I could not see
> the obvious advantages compared with current SRE's approach(via *yarn
> rmadmin or kubectl taint*).
>
> I agree with Yang Wang on this.
>
> >  To me this sounds yet again like one of those magical mechanisms that
> will rarely work just right.
>
> I also agree with Chesnay that magical mechanisms are indeed super hard to
> get right.
>
> Best regards,
>
> Martijn
>
> On Fri, 6 May 2022 at 12:03, Jiangang Liu 
> wrote:
>
>> Thanks for the valuable design. Auto-detection can save us a lot of work.
>> We have implemented a similar feature in our internal Flink
>> version.
>> Below is something that I care about:
>>
>>1. For auto-detecting, I wonder how to make the strategy and mark a
>> node
>>blocked? Sometimes the blocked node is hard to be detected, for
>> example,
>>the upper node or the down node will be blocked when network
>> unreachable.
>>2. I see that the strategy is made in JobMaster side. How about
>>implementing the similar logic in resource manager? In session mode,
>> multi
>>jobs can fail on the same bad node and the node should be marked
>> blocked.
>>If the job makes the strategy, the node may be not marked blocked if
>> the
>>fail times don't exceed the threshold.
>>
>>
>> Zhu Zhu  于2022年5月5日周四 23:35写道:
>>
>> > Thank you for all your feedback!
>> >
>> > Besides the answers from Lijie, I'd like to share some of my thoughts:
>> > 1. Whether to enable automatical blocklist
>> > Generally speaking, it is not a goal of FLIP-224.
>> > The automatical way should be something built upon the blocklist
>> > mechanism and well decoupled. It was designed to be a configurable
>> > blocklist strategy, but I think we can further decouple it by
>> > introducing a abnormal node detector, as Becket suggested, which just
>> > uses the blocklist mechanism once bad nodes are detected. However, it
>> > should be a separate FLIP with further dev discussions and feedback
>> > from users. I also agree with Becket that different users have different
>> > requirements, and we should listen to them.
>> >
>> > 2. Is it enough to just take away abnormal nodes externally
>> > My answer is no. As Lijie has mentioned, we need a way to avoid
>> > deploying tasks to temporary hot nodes. In this case, users may just
>> > want to limit the load of the node and do not want to kill all the
>> > processes on it. Another case is the speculative execution[1] which
>> > may also leverage this feature to avoid starting mirror tasks on slow
>> > nodes.
>> >
>> > Thanks,
>> > Zhu
>> >
>> > [1]
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+execution+for+Batch+Job
>> >
>> > Lijie Wang  于2022年5月5日周四 15:56写道:
>> >
>> > >
>> > > Hi everyone,
>> > >
>> > >
>> > > Thanks for your feedback.
>> > >
>> > >
>> > > There's one detail that I'd like to re-emphasize here because it can
>> > affect the value and design of the blocklist mechanism (perhaps I should
>> > highlight it in the FLIP). We propose two actions in FLIP:
>> > >
>> > > 1) MARK_BLOCKLISTED: Just mark the task manager or node as blocked.
>> > Future slots should not be allocated from the blocked task manager or
>> node.
>> > But slots that are already allocated will not be affected. A typical
>> > application scenario is to mitigate machine hotspots. In this case, we
>> hope
>> > that subsequent resource allocations will not be on the hot machine, but
>> > tasks curren
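The MARK_BLOCKLISTED semantics described above — block future slot allocations while leaving already-allocated slots untouched — can be pictured as a simple filter in slot allocation. All names in this sketch are hypothetical, not FLIP-224's actual classes:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch of MARK_BLOCKLISTED: blocked task managers are only
// filtered out of *future* slot allocations; existing slots keep running.
// Class and method names are illustrative assumptions.
public final class BlocklistTracker {
    private final Set<String> blockedTaskManagers = new HashSet<>();

    public void markBlocklisted(String taskManagerId) {
        blockedTaskManagers.add(taskManagerId);
    }

    /** Used by the slot allocator when picking a task manager for a new slot. */
    public List<String> filterAllocatable(List<String> candidates) {
        return candidates.stream()
                .filter(tm -> !blockedTaskManagers.contains(tm))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        BlocklistTracker tracker = new BlocklistTracker();
        tracker.markBlocklisted("tm-2"); // e.g. a hot node reported manually
        List<String> usable = tracker.filterAllocatable(List.of("tm-1", "tm-2", "tm-3"));
        if (!usable.equals(List.of("tm-1", "tm-3"))) {
            throw new AssertionError("blocked task manager should be filtered");
        }
        System.out.println("new slots avoid tm-2; existing slots are unaffected");
    }
}
```

MARK_BLOCKLISTED_AND_EVACUATE_TASKS would additionally cancel and reschedule the tasks currently running on the blocked node, which is why the two actions are kept separate.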

Re: [ANNOUNCE] New Flink PMC member: Yang Wang

2022-05-06 Thread Yu Li
Congrats and welcome, Yang!

Best Regards,
Yu


On Fri, 6 May 2022 at 14:48, Paul Lam  wrote:

> Congrats, Yang! Well Deserved!
>
> Best,
> Paul Lam
>
> > 2022年5月6日 14:38,Yun Tang  写道:
> >
> > Congratulations, Yang!
> >
> > Best
> > Yun Tang
> > 
> > From: Jing Ge 
> > Sent: Friday, May 6, 2022 14:24
> > To: dev 
> > Subject: Re: [ANNOUNCE] New Flink PMC member: Yang Wang
> >
> > Congrats Yang and well Deserved!
> >
> > Best regards,
> > Jing
> >
> > On Fri, May 6, 2022 at 7:38 AM Lincoln Lee 
> wrote:
> >
> >> Congratulations Yang!
> >>
> >> Best,
> >> Lincoln Lee
> >>
> >>
> >> Őrhidi Mátyás  于2022年5月6日周五 12:46写道:
> >>
> >>> Congrats Yang! Well deserved!
> >>> Best,
> >>> Matyas
> >>>
> >>> On Fri, May 6, 2022 at 5:30 AM huweihua 
> wrote:
> >>>
>  Congratulations Yang!
> 
>  Best,
>  Weihua
> 
> 
> >>>
> >>
>
>


Re: [ANNOUNCE] New Flink PMC member: Yang Wang

2022-05-06 Thread Peter Huang
Congrats, Yang!



Best Regards
Peter Huang

On Fri, May 6, 2022 at 8:46 AM Yu Li  wrote:

> Congrats and welcome, Yang!
>
> Best Regards,
> Yu
>
>
> On Fri, 6 May 2022 at 14:48, Paul Lam  wrote:
>
> > Congrats, Yang! Well Deserved!
> >
> > Best,
> > Paul Lam
> >
> > > 2022年5月6日 14:38,Yun Tang  写道:
> > >
> > > Congratulations, Yang!
> > >
> > > Best
> > > Yun Tang
> > > 
> > > From: Jing Ge 
> > > Sent: Friday, May 6, 2022 14:24
> > > To: dev 
> > > Subject: Re: [ANNOUNCE] New Flink PMC member: Yang Wang
> > >
> > > Congrats Yang and well Deserved!
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Fri, May 6, 2022 at 7:38 AM Lincoln Lee 
> > wrote:
> > >
> > >> Congratulations Yang!
> > >>
> > >> Best,
> > >> Lincoln Lee
> > >>
> > >>
> > >> Őrhidi Mátyás  于2022年5月6日周五 12:46写道:
> > >>
> > >>> Congrats Yang! Well deserved!
> > >>> Best,
> > >>> Matyas
> > >>>
> > >>> On Fri, May 6, 2022 at 5:30 AM huweihua 
> > wrote:
> > >>>
> >  Congratulations Yang!
> > 
> >  Best,
> >  Weihua
> > 
> > 
> > >>>
> > >>
> >
> >
>


[jira] [Created] (FLINK-27537) Remove requirement for Async Sink's RequestEntryT to be serializable

2022-05-06 Thread Zichen Liu (Jira)
Zichen Liu created FLINK-27537:
--

 Summary: Remove requirement for Async Sink's RequestEntryT to be 
serializable
 Key: FLINK-27537
 URL: https://issues.apache.org/jira/browse/FLINK-27537
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Common
Affects Versions: 1.15.0
Reporter: Zichen Liu


Currently, in `AsyncSinkBase` and its dependent classes, e.g. the sink writer, 
element converter etc., the `RequestEntryT` generic type is required to be 
`serializable`.

However, this requirement no longer holds and there is nothing that actually 
requires this.

Proposed approach:

* Remove the `extends Serializable` bound from the generic type `RequestEntryT`
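For illustration, the effect of dropping such a bound is that non-Serializable request entry types become usable. This is a minimal sketch with made-up class names, not the actual AsyncSinkWriter signature:

```java
import java.io.Serializable;

// Before (hypothetical): the bound forces every RequestEntryT to be Serializable.
class SinkWriterBefore<RequestEntryT extends Serializable> {}

// After: no bound, so any type may be used as the request entry.
class SinkWriterAfter<RequestEntryT> {}

public final class BoundDemo {
    public static void main(String[] args) {
        // StringBuilder is not Serializable, yet it now compiles as RequestEntryT.
        SinkWriterAfter<StringBuilder> writer = new SinkWriterAfter<>();
        System.out.println(writer.getClass().getSimpleName());
    }
}
```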





Re: [ANNOUNCE] New Flink PMC member: Yang Wang

2022-05-06 Thread David Morávek
Nice! Congrats Yang, well deserved! ;)

On Fri 6. 5. 2022 at 17:53, Peter Huang  wrote:

> Congrats, Yang!
>
>
>
> Best Regards
> Peter Huang
>
> On Fri, May 6, 2022 at 8:46 AM Yu Li  wrote:
>
> > Congrats and welcome, Yang!
> >
> > Best Regards,
> > Yu
> >
> >
> > On Fri, 6 May 2022 at 14:48, Paul Lam  wrote:
> >
> > > Congrats, Yang! Well Deserved!
> > >
> > > Best,
> > > Paul Lam
> > >
> > > > 2022年5月6日 14:38,Yun Tang  写道:
> > > >
> > > > Congratulations, Yang!
> > > >
> > > > Best
> > > > Yun Tang
> > > > 
> > > > From: Jing Ge 
> > > > Sent: Friday, May 6, 2022 14:24
> > > > To: dev 
> > > > Subject: Re: [ANNOUNCE] New Flink PMC member: Yang Wang
> > > >
> > > > Congrats Yang and well Deserved!
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Fri, May 6, 2022 at 7:38 AM Lincoln Lee 
> > > wrote:
> > > >
> > > >> Congratulations Yang!
> > > >>
> > > >> Best,
> > > >> Lincoln Lee
> > > >>
> > > >>
> > > >> Őrhidi Mátyás  wrote on Fri, May 6, 2022 at 12:46:
> > > >>
> > > >>> Congrats Yang! Well deserved!
> > > >>> Best,
> > > >>> Matyas
> > > >>>
> > > >>> On Fri, May 6, 2022 at 5:30 AM huweihua 
> > > wrote:
> > > >>>
> > >  Congratulations Yang!
> > > 
> > >  Best,
> > >  Weihua
> > > 
> > > 
> > > >>>
> > > >>
> > >
> > >
> >
>


Re: [Discuss] Creating an Apache Flink slack workspace

2022-05-06 Thread Jingsong Li
Most of the open source communities I know have set up their slack
channels, such as Apache Iceberg [1], Apache Druid [2], etc.
So I think Slack is worth trying.

David is right: there are some cases that need back-and-forth
communication, and Slack will be more effective there.

But back to the question: ultimately it's about whether there are
enough core developers willing to invest time in Slack, to
discuss, to answer questions, to communicate.
And whether we will still have enough time to reply on the mailing list and
Stack Overflow (which we need to do) after putting effort into Slack.

[1] https://iceberg.apache.org/community/#slack
[2] https://druid.apache.org/community/

On Fri, May 6, 2022 at 10:06 PM David Anderson  wrote:
>
> I have mixed feelings about this.
>
> I have been rather visible on stack overflow, and as a result I get a lot of 
> DMs asking for help. I enjoy helping, but want to do it on a platform where 
> the responses can be searched and shared.
>
> It is currently the case that good questions on stack overflow frequently go 
> unanswered because no one with the necessary expertise takes the time to 
> respond. If the Flink community has the collective energy to do more user 
> outreach, more involvement on stack overflow would be a good place to start. 
> Adding slack as another way for users to request help from those who are 
> already actively providing support on the existing communication channels 
> might just lead to burnout.
>
> On the other hand, there are rather rare, but very interesting cases where 
> considerable back and forth is needed to figure out what's going on. This can 
> happen, for example, when the requirements are unusual, or when a difficult 
> to diagnose bug is involved. In these circumstances, something like slack is 
> much better suited than email or stack overflow.
>
> David
>
> On Fri, May 6, 2022 at 3:04 PM Becket Qin  wrote:
>>
>> Thanks for the proposal, Xintong.
>>
>> While I share the same concerns as those mentioned in the previous 
>> discussion thread, admittedly there are benefits of having a slack channel 
>> as a supplementary way to discuss Flink. The fact that this topic is raised 
>> once in a while indicates lasting interest.
>>
>> Personally I am open to having such a slack channel. Although it has 
>> drawbacks, it serves a different purpose. I'd imagine that for people who 
>> prefer instant messaging, in absence of the slack channel, a lot of 
>> discussions might just take place offline today, which leaves no public 
>> record at all.
>>
>> One step further, if the channel is maintained by the Flink PMC, some kind 
>> of code of conduct might be necessary. I think the suggestions of ad-hoc 
>> conversations, reflecting back to the emails are good starting points. I am 
>> +1 to give it a try and see how it goes. In the worst case, we can just stop 
>> doing this and come back to where we are right now.
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>> On Fri, May 6, 2022 at 8:55 PM Martijn Visser  wrote:
>>>
>>> Hi everyone,
>>>
>>> While I see Slack having a major downside (the results are not indexed by 
>>> external search engines, you can't link directly to Slack content unless 
>>> you've signed up), I do think that the open source space has progressed and 
>>> that Slack is considered something invaluable to users. There are 
>>> other Apache projects that also run it, like Apache Airflow [1]. I also see 
>>> it as a potential option to create a more active community.
>>>
>>> A concern I can see is that users will start DMing well-known 
>>> reviewers/committers to get a review or a PR merged. That can cause a lot 
>>> of noise. I can go +1 for Slack, but then we need to establish a set of 
>>> community rules.
>>>
>>> Best regards,
>>>
>>> Martijn
>>>
>>> [1] https://airflow.apache.org/community/
>>>
>>> On Fri, 6 May 2022 at 13:59, Piotr Nowojski  wrote:

 Hi Xintong,

 I'm not sure if slack is the right tool for the job. IMO it works great as
 an adhoc tool for discussion between developers, but it's not searchable
 and it's not persistent. Between devs, it works fine, as long as the result
 of the ad hoc discussions is backported to JIRA/mailing list/design doc.
 For users, that simply would be extremely difficult to achieve. As a
 result, I would be afraid we would be answering the same questions over and
 over again, without even a way to provide a link to the previous
 thread, because nobody can search for it.

 I'm +1 for having an open and shared slack space/channel for the
 contributors, but I think I would be -1 for such channels for the users.

 For users, I would prefer to focus more on, for example, stackoverflow.
 With upvoting, clever sorting of the answers (not the oldest/newest at top)
 it's easily searchable - those features make it fit our use case much
 better IMO.

 Best,
 Piotrek



>>>

How to support C#/dotNet ?

2022-05-06 Thread Bruce Tian
Hi folks

Flink is a very awesome project! I want to develop a C# SDK similar to the 
Java and Python SDKs. Is there any protocol between Flink and a client SDK?



Re: How to support C#/dotNet ?

2022-05-06 Thread Cristian Constantinescu
Not exactly what you asked for, but... Have a look at the Apache Beam
project.

Their goal is quite literally any language on any runner (including Flink).

They recently released an initial version in JavaScript "for fun" and it
took about two weeks to develop. To give you an idea of the effort it takes.

On Fri., May 6, 2022, 21:53 Bruce Tian,  wrote:

> Hi folks
>
> Flink is a very awesome project! I want to develop a C# SDK similar to the
> Java and Python SDKs. Is there any protocol between Flink and a client
> SDK?
>
>


Re: Edit Permissions for Flink Connector Template

2022-05-06 Thread Xintong Song
Hi Jeremy,

Could you add a link to the previous discussion?

And you would need to first create an account at
https://cwiki.apache.org/confluence .

Thank you~

Xintong Song



On Fri, May 6, 2022 at 9:26 PM Ber, Jeremy  wrote:

> Hello,
>
> I require Confluence Edit Permissions in order to create a Flink Connector
> Template page as discussed via e-mail.
>
> Jeremy
>


Re: How to support C#/dotNet ?

2022-05-06 Thread Yun Tang
Hi Tian,

I am not sure whether you want to get such an SDK or just want to refer to 
some other existing repos to try to implement your own.
For the former purpose, you can find several non-official projects on 
GitHub [1], which have not been widely tested.
If you just want to refer to some other existing repos, I can share some 
experience here. Several years ago, I participated in an open-source project 
called Mobius [2], which offers a C# API on Apache Spark.
The core idea is to introduce an adapter from C# to call Java [3]. Though this 
project has been deprecated, you can refer to a still-active project [4] for 
more information on how to support a C# API in Flink.


[1] https://github.com/HEF-Sharp/HEF.Flink
[2] https://github.com/microsoft/Mobius
[3] https://github.com/microsoft/Mobius/tree/master/csharp/Adapter
[4] https://github.com/dotnet/spark
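As a rough illustration of the adapter idea in [3] (a hedged sketch with invented names and a made-up one-line protocol — not an official Flink, Mobius, or dotnet/spark interface), the JVM side can expose a simple socket endpoint that a non-JVM SDK (e.g. C#) speaks to; here a local thread stands in for the C# client:

```java
import java.io.*;
import java.net.*;

public class AdapterSketch {
    // JVM-side handler: what the adapter endpoint would do with a request line.
    static String handle(String request) {
        return "ACK: " + request;
    }

    // One request/response round trip over a local socket, with a thread
    // standing in for the non-JVM (e.g. C#) client.
    static String roundTrip(String request) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {
            Thread jvmSide = new Thread(() -> {
                try (Socket s = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(s.getInputStream()));
                     PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                    out.println(handle(in.readLine())); // serve one request
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            jvmSide.start();
            try (Socket s = new Socket("localhost", server.getLocalPort());
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream()));
                 PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                out.println(request);          // the client SDK sends a command
                String reply = in.readLine();  // and reads the JVM's reply
                jvmSide.join();
                return reply;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("execute my-job")); // prints "ACK: execute my-job"
    }
}
```

A real adapter would of course need a richer wire format (length-prefixed frames, typed payloads), which is essentially what the Mobius/dotnet-spark bridges implement.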

Best
Yun Tang


From: Cristian Constantinescu 
Sent: Saturday, May 7, 2022 9:58
To: dev@flink.apache.org 
Subject: Re: How to support C#/dotNet ?

Not exactly what you asked for, but... Have a look at the Apache Beam
project.

Their goal is quite literally any language on any runner (including Flink).

They recently released an initial version in JavaScript "for fun" and it
took about two weeks to develop. To give you an idea of the effort it takes.

On Fri., May 6, 2022, 21:53 Bruce Tian,  wrote:

> Hi folks
>
> Flink is a very awesome project! I want to develop a C# SDK similar to the
> Java and Python SDKs. Is there any protocol between Flink and a client
> SDK?
>
>


Re: Re: [ANNOUNCE] New Flink PMC member: Yang Wang

2022-05-06 Thread Yun Gao
Congratulations Yang!

Best,
Yun Gao



------ Original Mail ------
Sender: David Morávek 
Send Date: Sat May 7 01:05:41 2022
Recipients: Dev 
Subject:Re: [ANNOUNCE] New Flink PMC member: Yang Wang
Nice! Congrats Yang, well deserved! ;)

On Fri 6. 5. 2022 at 17:53, Peter Huang  wrote:

> Congrats, Yang!
>
>
>
> Best Regards
> Peter Huang
>
> On Fri, May 6, 2022 at 8:46 AM Yu Li  wrote:
>
> > Congrats and welcome, Yang!
> >
> > Best Regards,
> > Yu
> >
> >
> > On Fri, 6 May 2022 at 14:48, Paul Lam  wrote:
> >
> > > Congrats, Yang! Well Deserved!
> > >
> > > Best,
> > > Paul Lam
> > >
> > > > On May 6, 2022, at 14:38, Yun Tang  wrote:
> > > >
> > > > Congratulations, Yang!
> > > >
> > > > Best
> > > > Yun Tang
> > > > 
> > > > From: Jing Ge 
> > > > Sent: Friday, May 6, 2022 14:24
> > > > To: dev 
> > > > Subject: Re: [ANNOUNCE] New Flink PMC member: Yang Wang
> > > >
> > > > Congrats Yang and well Deserved!
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Fri, May 6, 2022 at 7:38 AM Lincoln Lee 
> > > wrote:
> > > >
> > > >> Congratulations Yang!
> > > >>
> > > >> Best,
> > > >> Lincoln Lee
> > > >>
> > > >>
> > > >> Őrhidi Mátyás  wrote on Fri, May 6, 2022 at 12:46:
> > > >>
> > > >>> Congrats Yang! Well deserved!
> > > >>> Best,
> > > >>> Matyas
> > > >>>
> > > >>> On Fri, May 6, 2022 at 5:30 AM huweihua 
> > > wrote:
> > > >>>
> > >  Congratulations Yang!
> > > 
> > >  Best,
> > >  Weihua
> > > 
> > > 
> > > >>>
> > > >>
> > >
> > >
> >
>


[jira] [Created] (FLINK-27538) Change flink.version 1.15-SNAPSHOT to 1.15.0 in table store

2022-05-06 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-27538:


 Summary: Change flink.version 1.15-SNAPSHOT to 1.15.0 in table 
store
 Key: FLINK-27538
 URL: https://issues.apache.org/jira/browse/FLINK-27538
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.2.0


* change flink.version
 * Use flink docker in E2eTestBase





Re: [ANNOUNCE] Apache Flink 1.15.0 released

2022-05-06 Thread Yun Gao
Many thanks to everyone for all the help on 
releasing 1.15.0!

And the doc links are now also fixed, thanks
Peter for pointing out the issues!

Best,
Yun Gao


--
From:Johannes Moser 
Send Time:2022 May 6 (Fri.) 17:42
To:dev 
Subject:Re: [ANNOUNCE] Apache Flink 1.15.0 released

Yes! 🎉

Thanks to the whole community.
All this involvement keeps impressing me.

> On 06.05.2022, at 01:42, Thomas Weise  wrote:
> 
> Thank you to all who contributed for making this release happen!
> 
> Thomas
> 
> On Thu, May 5, 2022 at 7:41 AM Zhu Zhu  wrote:
>> 
>> Thanks Yun, Till and Joe for the great work and thanks everyone who
>> makes this release possible!
>> 
>> Cheers,
>> Zhu
>> 
>>> Jiangang Liu  wrote on Thu, May 5, 2022 at 21:09:
>>> 
>>> Congratulations! This version is really helpful for us. We will explore it
>>> and help to improve it.
>>> 
>>> Best
>>> Jiangang Liu
>>> 
>>> Yu Li  wrote on Thu, May 5, 2022 at 18:53:
>>> 
 Hurray!
 
 Thanks Yun Gao, Till and Joe for all the efforts as our release managers.
 And thanks all contributors for making this happen!
 
 Best Regards,
 Yu
 
 
 On Thu, 5 May 2022 at 18:01, Sergey Nuyanzin  wrote:
 
> Great news!
> Congratulations!
> Thanks to the release managers, and everyone involved.
> 
> On Thu, May 5, 2022 at 11:57 AM godfrey he  wrote:
> 
>> Congratulations~
>> 
>> Thanks Yun, Till and Joe for driving this release
>> and everyone who made this release happen.
>> 
>> Best,
>> Godfrey
>> 
>> Becket Qin  wrote on Thu, May 5, 2022 at 17:39:
>>> 
>>> Hooray! Thanks Yun, Till and Joe for driving the release!
>>> 
>>> Cheers,
>>> 
>>> JIangjie (Becket) Qin
>>> 
>>> On Thu, May 5, 2022 at 5:20 PM Timo Walther 
> wrote:
>>> 
 It took a bit longer than usual. But I'm sure the users will love
> this
 release.
 
 Big thanks to the release managers!
 
 Timo
 
 Am 05.05.22 um 10:45 schrieb Yuan Mei:
> Great!
> 
> Thanks, Yun Gao, Till, and Joe for driving the release, and
 thanks
> to
> everyone for making this release happen!
> 
> Best
> Yuan
> 
> On Thu, May 5, 2022 at 4:40 PM Leonard Xu 
> wrote:
> 
>> Congratulations!
>> 
>> Thanks Yun Gao, Till and Joe for the great work as our release
>> manager
 and
>> everyone who involved.
>> 
>> Best,
>> Leonard
>> 
>> 
>> 
>>> On May 5, 2022, at 4:30 PM, Yang Wang  wrote:
>>> 
>>> Congratulations!
>>> 
>>> Thanks Yun Gao, Till and Joe for driving this release and
> everyone
>> who
>> made
>>> this release happen.
>> 
 
 
>> 
> 
> 
> --
> Best regards,
> Sergey
> 
 

Re: Source alignment for Iceberg

2022-05-06 Thread Steven Wu
The conclusion of this discussion could be that we don't see much value in
leveraging FLIP-182 with Iceberg source. That would totally be fine.

For me, one big sticking point is the alignment timestamp for the (Iceberg)
source might be the same as the Flink application watermark.

On Thu, May 5, 2022 at 9:53 PM Piotr Nowojski 
wrote:

> Option 1 sounds reasonable but I would be tempted to wait for a second
> motivational use case before generalizing the framework. However I wouldn’t
> oppose this extension if others feel it’s useful and good thing to do
>
> Piotrek
>
> > Wiadomość napisana przez Becket Qin  w dniu
> 06.05.2022, o godz. 03:50:
> >
> > I think the key point here is essentially what information should Flink
> > expose to the user pluggables. Apparently split / local task watermark is
> > something many user pluggables would be interested in. Right now it is
> > calculated by the Flink framework but not exposed to the users space,
> i.e.
> > SourceReader / SplitEnumerator. So it looks at least we can offer this
> > information in some way so users can leverage that information to do
> > things.
> >
> > That said, I am not sure if this would help in the Iceberg alignment
> case.
> > Because at this point, FLIP-182 reports source reader watermarks
> > periodically, which may not align with the RequestSplitEvent. So if we
> > really want to leverage the FLIP-182 mechanism here, I see a few ways,
> just
> > to name two of them:
> > 1. we can expose the source reader watermark in the SourceReaderContext,
> so
> > the source readers can put the local watermark in a custom operator
> event.
> > This will effectively bypass the existing RequestSplitEvent. Or we can
> also
> > extend the RequestSplitEvent to add an additional info field of byte[]
> > type, so users can piggy-back additional information there, be it
> watermark
> > or other stuff.
> > 2. Simply piggy-back the local watermark in the RequestSplitEvent and
> pass
> > that info to the SplitEnumerator as well.
> >
> > If we are going to do this, personally I'd prefer the first way, as it
> > provides a mechanism to allow future extension. So it would be easier to
> > expose other framework information to the user space in the future.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >
> >
> >> On Fri, May 6, 2022 at 6:15 AM Thomas Weise  wrote:
> >>
> >>> On Wed, May 4, 2022 at 11:03 AM Steven Wu 
> wrote:
> >>> Any opinion on different timestamp for source alignment (vs Flink
> >> application watermark)? For Iceberg source, we might want to enforce
> >> alignment on kafka timestamp but Flink application watermark may use
> event
> >> time field from payload.
> >>
> >> I imagine that more generally the question is alignment based on the
> >> iceberg partition/file metadata vs. individual rows? I think that
> >> should work as long as there is a guarantee for out of orderness
> >> within the split?
> >>
> >> Thomas
> >>
> >>>
> >>> Thanks,
> >>> Steven
> >>>
> >>> On Wed, May 4, 2022 at 7:02 AM Becket Qin 
> wrote:
> 
>  Hey Piotr,
> 
>  I think the mechanism FLIP-182 provided is a reasonable default one,
> >> which
>  ensures the watermarks are only drifted by an upper bound. However,
>  admittedly there are also other strategies for different purposes.
> 
>  In the Iceberg case, I am not sure if a static strictly allowed
> >> watermark
>  drift is desired. The source might just want to finish reading the
> >> assigned
>  splits as fast as possible. And it is OK to have a drift of "one
> split",
>  instead of a fixed time period.
> 
>  As another example, if there are some fast readers whose splits are
> >> always
>  throttled, while the other slow readers are struggling to keep up with
> >> the
>  rest of the splits, the split enumerator may decide to reassign the
> slow
>  splits so all the readers have something to read. This would need the
>  SplitEnumerator to be aware of the watermark progress on each reader.
> >> So it
>  seems useful to expose the WatermarkAlignmentEvent information to the
>  SplitEnumerator as well.
> 
>  Thanks,
> 
>  Jiangjie (Becket) Qin
> 
> 
> 
>  On Tue, May 3, 2022 at 7:58 PM Piotr Nowojski 
> >> wrote:
> 
> > Hi Steven,
> >
> > Isn't this redundant to FLIP-182 and FLIP-217? Can not Iceberg just
> >> emit
> > all splits and let FLIP-182/FLIP-217 handle the watermark alignment
> >> and
> > block the splits that are too much into the future? I can see this
> >> being an
> > issue if the existence of too many blocked splits is occupying too
> >> many
> > resources.
> >
> > If that's the case, indeed SourceCoordinator/SplitEnumerator would
> >> have to
> > decide on some basis how many and which splits to assign in what
> >> order. But
> > in that case I'm not sure how much you could use from FLIP-182 and
> > FLIP-217. They seem somehow orthogonal to m

Re: Source alignment for Iceberg

2022-05-06 Thread Steven Wu
might be the same as => might NOT be the same as

On Fri, May 6, 2022 at 8:13 PM Steven Wu  wrote:

> The conclusion of this discussion could be that we don't see much value in
> leveraging FLIP-182 with Iceberg source. That would totally be fine.
>
> For me, one big sticking point is the alignment timestamp for the
> (Iceberg) source might be the same as the Flink application watermark.
>
> On Thu, May 5, 2022 at 9:53 PM Piotr Nowojski 
> wrote:
>
>> Option 1 sounds reasonable but I would be tempted to wait for a second
>> motivational use case before generalizing the framework. However I wouldn’t
>> oppose this extension if others feel it’s useful and good thing to do
>>
>> Piotrek
>>
>> > Wiadomość napisana przez Becket Qin  w dniu
>> 06.05.2022, o godz. 03:50:
>> >
>> > I think the key point here is essentially what information should Flink
>> > expose to the user pluggables. Apparently split / local task watermark
>> is
>> > something many user pluggables would be interested in. Right now it is
>> > calculated by the Flink framework but not exposed to the users space,
>> i.e.
>> > SourceReader / SplitEnumerator. So it looks at least we can offer this
>> > information in some way so users can leverage that information to do
>> > things.
>> >
>> > That said, I am not sure if this would help in the Iceberg alignment
>> case.
>> > Because at this point, FLIP-182 reports source reader watermarks
>> > periodically, which may not align with the RequestSplitEvent. So if we
>> > really want to leverage the FLIP-182 mechanism here, I see a few ways,
>> just
>> > to name two of them:
>> > 1. we can expose the source reader watermark in the
>> SourceReaderContext, so
>> > the source readers can put the local watermark in a custom operator
>> event.
>> > This will effectively bypass the existing RequestSplitEvent. Or we can
>> also
>> > extend the RequestSplitEvent to add an additional info field of byte[]
>> > type, so users can piggy-back additional information there, be it
>> watermark
>> > or other stuff.
>> > 2. Simply piggy-back the local watermark in the RequestSplitEvent and
>> pass
>> > that info to the SplitEnumerator as well.
>> >
>> > If we are going to do this, personally I'd prefer the first way, as it
>> > provides a mechanism to allow future extension. So it would be easier to
>> > expose other framework information to the user space in the future.
>> >
>> > Thanks,
>> >
>> > Jiangjie (Becket) Qin
>> >
>> >
>> >
>> >> On Fri, May 6, 2022 at 6:15 AM Thomas Weise  wrote:
>> >>
>> >>> On Wed, May 4, 2022 at 11:03 AM Steven Wu 
>> wrote:
>> >>> Any opinion on different timestamp for source alignment (vs Flink
>> >> application watermark)? For Iceberg source, we might want to enforce
>> >> alignment on kafka timestamp but Flink application watermark may use
>> event
>> >> time field from payload.
>> >>
>> >> I imagine that more generally the question is alignment based on the
>> >> iceberg partition/file metadata vs. individual rows? I think that
>> >> should work as long as there is a guarantee for out of orderness
>> >> within the split?
>> >>
>> >> Thomas
>> >>
>> >>>
>> >>> Thanks,
>> >>> Steven
>> >>>
>> >>> On Wed, May 4, 2022 at 7:02 AM Becket Qin 
>> wrote:
>> 
>>  Hey Piotr,
>> 
>>  I think the mechanism FLIP-182 provided is a reasonable default one,
>> >> which
>>  ensures the watermarks are only drifted by an upper bound. However,
>>  admittedly there are also other strategies for different purposes.
>> 
>>  In the Iceberg case, I am not sure if a static strictly allowed
>> >> watermark
>>  drift is desired. The source might just want to finish reading the
>> >> assigned
>>  splits as fast as possible. And it is OK to have a drift of "one
>> split",
>>  instead of a fixed time period.
>> 
>>  As another example, if there are some fast readers whose splits are
>> >> always
>>  throttled, while the other slow readers are struggling to keep up
>> with
>> >> the
>>  rest of the splits, the split enumerator may decide to reassign the
>> slow
>>  splits so all the readers have something to read. This would need the
>>  SplitEnumerator to be aware of the watermark progress on each reader.
>> >> So it
>>  seems useful to expose the WatermarkAlignmentEvent information to the
>>  SplitEnumerator as well.
>> 
>>  Thanks,
>> 
>>  Jiangjie (Becket) Qin
>> 
>> 
>> 
>>  On Tue, May 3, 2022 at 7:58 PM Piotr Nowojski 
>> >> wrote:
>> 
>> > Hi Steven,
>> >
>> > Isn't this redundant to FLIP-182 and FLIP-217? Can not Iceberg just
>> >> emit
>> > all splits and let FLIP-182/FLIP-217 handle the watermark alignment
>> >> and
>> > block the splits that are too much into the future? I can see this
>> >> being an
>> > issue if the existence of too many blocked splits is occupying too
>> >> many
>> > resources.
>> >
>> > If that's the case, indeed SourceCoordinator/SplitEn

Re: [DISCUSS] FLIP-224: Blacklist Mechanism

2022-05-06 Thread Yang Wang
Thanks Lijie and ZhuZhu for the explanation.

I just overlooked the "MARK_BLOCKLISTED". For the task level, it is indeed
functionality that external tools (e.g. kubectl taint) could not
support.


Best,
Yang

Lijie Wang  wrote on Fri, May 6, 2022 at 22:18:

> Thanks for your feedback, Jiangang and Martijn.
>
> @Jiangang
>
>
> > For auto-detecting, I wonder how to make the strategy and mark a node
> blocked?
>
> In fact, we currently plan to not support auto-detection in this FLIP. The
> part about auto-detection may be continued in a separate FLIP in the
> future. Some guys have the same concerns as you, and the correctness and
> necessity of auto-detection may require further discussion in the future.
>
> > In session mode, multi jobs can fail on the same bad node and the node
> should be marked blocked.
> By design, the blocklist information will be shared among all jobs in a
> cluster/session. The JM will sync blocklist information with RM.
>
> @Martijn
>
> > I agree with Yang Wang on this.
> As Zhu Zhu and I mentioned above, we think MARK_BLOCKLISTED (which just
> limits the load of the node and does not kill all the processes on it) is also
> important, and we think that external systems (*yarn rmadmin or kubectl
> taint*) cannot support it. So we think it makes sense even if used only
> *manually*.
>
> > I also agree with Chesnay that magical mechanisms are indeed super hard
> to get right.
> Yes, as you see, Jiangang(and a few others) have the same concern.
> However, we currently plan to not support auto-detection in this FLIP, and
> only *manually*. In addition, I'd like to say that the FLIP provides a
> mechanism to support MARK_BLOCKLISTED and
> MARK_BLOCKLISTED_AND_EVACUATE_TASKS,
> the auto-detection may be done by external systems.
>
> Best,
> Lijie
>
> Martijn Visser  wrote on Fri, May 6, 2022 at 19:04:
>
> > > If we only support to block nodes manually, then I could not see
> > the obvious advantages compared with current SRE's approach(via *yarn
> > rmadmin or kubectl taint*).
> >
> > I agree with Yang Wang on this.
> >
> > >  To me this sounds yet again like one of those magical mechanisms that
> > will rarely work just right.
> >
> > I also agree with Chesnay that magical mechanisms are indeed super hard
> to
> > get right.
> >
> > Best regards,
> >
> > Martijn
> >
> > On Fri, 6 May 2022 at 12:03, Jiangang Liu 
> > wrote:
> >
> >> Thanks for the valuable design. Auto-detection can save us a lot of
> >> work. We have implemented a similar feature in our internal Flink
> >> version.
> >> Below is something that I care about:
> >>
> >>1. For auto-detecting, I wonder how to make the strategy and mark a
> >> node
> >>blocked? Sometimes the blocked node is hard to be detected, for
> >> example,
> >>the upper node or the down node will be blocked when network
> >> unreachable.
> >>2. I see that the strategy is made in JobMaster side. How about
> >>implementing the similar logic in resource manager? In session mode,
> >> multi
> >>jobs can fail on the same bad node and the node should be marked
> >> blocked.
> >>If the job makes the strategy, the node may be not marked blocked if
> >> the
> >>fail times don't exceed the threshold.
> >>
> >>
> >> Zhu Zhu  wrote on Thu, May 5, 2022 at 23:35:
> >>
> >> > Thank you for all your feedback!
> >> >
> >> > Besides the answers from Lijie, I'd like to share some of my thoughts:
> >> > 1. Whether to enable automatical blocklist
> >> > Generally speaking, it is not a goal of FLIP-224.
> >> > The automatical way should be something built upon the blocklist
> >> > mechanism and well decoupled. It was designed to be a configurable
> >> > blocklist strategy, but I think we can further decouple it by
> >> > introducing an abnormal node detector, as Becket suggested, which just
> >> > uses the blocklist mechanism once bad nodes are detected. However, it
> >> > should be a separate FLIP with further dev discussions and feedback
> >> > from users. I also agree with Becket that different users have
> different
> >> > requirements, and we should listen to them.
> >> >
> >> > 2. Is it enough to just take away abnormal nodes externally
> >> > My answer is no. As Lijie has mentioned, we need a way to avoid
> >> > deploying tasks to temporary hot nodes. In this case, users may just
> >> > want to limit the load of the node and do not want to kill all the
> >> > processes on it. Another case is the speculative execution[1] which
> >> > may also leverage this feature to avoid starting mirror tasks on slow
> >> > nodes.
> >> >
> >> > Thanks,
> >> > Zhu
> >> >
> >> > [1]
> >> >
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+execution+for+Batch+Job
> >> >
> >> > Lijie Wang  wrote on Thu, May 5, 2022 at 15:56:
> >> >
> >> > >
> >> > > Hi everyone,
> >> > >
> >> > >
> >> > > Thanks for your feedback.
> >> > >
> >> > >
> >> > > There's one detail that I'd like to re-emphasize here because it can
> >> > affect the value and design of the blocklist mechanism (perhaps I
> s

[DISCUSS] HybridSource Table & SQL API timeline

2022-05-06 Thread Ran Tao
HybridSource is a good feature, but the current release does not support the
Table & SQL API; I wonder when it will be ready for end users.

I have implemented an internal version at my company and it works well.
The Table & SQL API implementation may involve some core questions, e.g. the
start and end offsets of the bounded & unbounded sources. A child source's
schema may differ from the hybrid source's DDL schema (the batch or streaming
side may have more fields, or lack some), so we need to handle the
inconsistent-field problem or provide default field values (when a child
source lacks some fields), etc. So we may need a field mapping here.
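The field-mapping idea above can be sketched roughly as follows (a hedged illustration with invented names and types — not an actual Flink interface): align a child source's row to the hybrid source's declared schema, filling missing fields with defaults and dropping extras.

```java
import java.util.*;

public class FieldMappingSketch {
    // Align a child-source row to the target DDL schema: keep declared fields,
    // fill missing ones from the defaults map, and drop undeclared fields.
    static Map<String, Object> align(Map<String, Object> childRow,
                                     List<String> ddlFields,
                                     Map<String, Object> defaults) {
        Map<String, Object> aligned = new LinkedHashMap<>();
        for (String field : ddlFields) {
            aligned.put(field, childRow.containsKey(field)
                    ? childRow.get(field)
                    : defaults.get(field)); // default (possibly null) when absent
        }
        return aligned;
    }

    public static void main(String[] args) {
        // Child (batch) source lacks "offset" and has an extra "partition" field.
        Map<String, Object> childRow = new LinkedHashMap<>();
        childRow.put("id", 1);
        childRow.put("name", "a");
        childRow.put("partition", 7);

        Map<String, Object> aligned = align(
                childRow,
                Arrays.asList("id", "name", "offset"),
                Collections.singletonMap("offset", 0L));
        System.out.println(aligned); // prints {id=1, name=a, offset=0}
    }
}
```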

I have some ideas and implementations; if the Table & SQL API work is in
progress, I'm glad to share them or take part in the development.

thanks~


Re: [DISCUSS] FLIP-91: Support SQL Client Gateway

2022-05-06 Thread Echo Lee
Hi Shengkai,

Thanks for driving the proposal. I've been paying attention to the 
ververica/flink-sql-gateway [1] project recently. Because the Flink version my 
company is currently using is 1.14, I tried to upgrade the gateway to 
Flink 1.14, but the changes in the flink-table API make the upgrade very 
painful. At the same time, I found that flink-sql-gateway and 
flink-sql-client have many similarities in design. So at present, I am more 
concerned about whether flink-sql-gateway should be independent or kept in 
the Flink project, similar to sql-client.
In addition, I am also very interested in flink-sql-gateway's support for OLAP. 
Recently, when I tried to upgrade flink-sql-gateway, I was confused about the 
query statement to get the result. There are mainly the following issues:
1. For stream processing, what's the point of fetching the result? Is it just 
for debugging, and how can it be unified with batch processing?
2. For batch processing, does the gateway need to cache all fetched results?
3. Should executing a query and fetching its results be synchronous or 
asynchronous?
4. When executing queries in flink-sql-client, I often find error logs of 
FlinkJobNotFoundException. Should this be optimized?

[1] https://github.com/ververica/flink-sql-gateway

Best,
Echo Lee

> On May 6, 2022, at 5:05 PM, LuNing Wang  wrote:
> 
> Thanks, Shengkai for driving.  And all for your discussion.
> 
> 
> 
>> integrate the Gateway into the Flink code base
> 
> After talking with Shengkai offline and reading the talk `Powering HTAP at
> ByteDance with Apache Flink` from Flink Forward Asia, I think it is better to
> integrate the Gateway code into the Flink codebase.
> 
> 
> In the future, we can add a feature that merges the SQL gateway into the
> JobManager. We could then request the JobManager API to directly submit a
> Flink SQL job. It will further improve the performance of Flink OLAP. In the
> future, Flink must be a unified engine for batch, stream, and OLAP.
> Presto/Trino directly requests the master node to submit a job; if we do the
> same, we can reduce Q&M in Flink session mode. Perhaps the Flink application
> mode can't merge the SQL gateway into the JobManager, but Flink OLAP almost
> always uses session mode.
> 
>> Gateway to support multiple Flink versions
> 
> 
> If we merge the SQL gateway into the JobManager, the SQL Gateway itself
> can adapt to only one Flink version. We could introduce a network gateway to
> redirect requests to the Gateway or to JobManagers of various versions.
> Perhaps the network gateway could use other projects, like Apache Kyuubi or
> Zeppelin, etc.
> 
>> I don't think that the Gateway is a 'core' function of Flink which should
> 
> be included with Flink.
> 
> In production environments, Flink SQL is almost always used with a gateway.
> This can be observed in the user mailing lists and some Flink Forward talks.
> A SQL gateway is an important piece of infrastructure for a big data compute
> engine. As Flink does not have one, many Flink users build a SQL gateway with
> the Apache Kyuubi project, but this should be the work of official Flink.
> 
>> I think it's fine to move this functionality to the client rather than
> 
> gateway. WDYT?
> 
> I agree with having the `init-file` option in the client. I think the
> `init-file` functionality in the Gateway is NOT important for the first
> version of the Gateway. The Hive JDBC option 'initFile' already provides
> this functionality. Once the SQL Gateway is released and we have observed
> feedback from the community, we can discuss this again.
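> For reference, an init file of this kind is simply a list of statements
> executed when the session is opened. A minimal sketch (the statements,
> catalog, and database names below are purely illustrative, not an agreed
> format):
>
> ```sql
> -- init.sql: executed once when the connection/session is established
> USE CATALOG my_catalog;   -- catalog/database names are examples
> USE my_database;
> SET 'table.sql-dialect' = 'default';
> ```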
> 
> Best,
> 
> LuNing Wang.
> 
> 
Shengkai Fang  wrote on Fri, May 6, 2022 at 14:34:
> 
>> Thanks Martijn, Nicholas, Godfrey, Jark and Jingsong for the feedback.
>> 
>>> I would like to understand why it's complicated to make the upgrades
>>> problematic
>> 
>> I agree with Jark's point. The API is actually not very stable in Flink.
>> For example, the Gateway relies on the planner, but in release-1.14 Flink
>> renamed the Blink planner package, and in release-1.15 Flink made the
>> planner Scala-free, which means other projects should not directly rely on
>> the planner.
>> 
>>> Does the Flink SQL gateway support submitting a batch job?
>> 
>> Of course. In the SQL Gateway, you can simply execute SET
>> 'execution.runtime-mode' = 'batch' to switch to the batch environment.
>> Jobs submitted afterwards will then be executed in batch mode.
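>> A minimal interaction sketch (statement syntax as in the current SQL
>> client; table names are illustrative):
>>
>> ```sql
>> -- Switch the current session to batch execution mode.
>> SET 'execution.runtime-mode' = 'batch';
>>
>> -- Subsequent statements run as bounded batch jobs, e.g.:
>> INSERT INTO daily_stats
>> SELECT word, COUNT(*) FROM bounded_source GROUP BY word;
>> ```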
>> 
>>> The architecture of the Gateway is in the following graph.
>> Is the TableEnvironment shared for all sessions ?
>> 
>> No. Every session has its own TableEnvironment. I have updated the graph
>> to make this clearer.
>> 
>>> /v1/sessions
>>>> Are both local files and remote files supported for `libs` and `jars`?
>> 
>> We don't limit the usage here, but I think we will only support local
>> files in the next version.
>> 
>>>> Does the SQL gateway support uploading files?
>> 
>> No. We need a new API to do this. We can collect more user feedback to
>> determine whether we need to implement this feature.
>> 
>>> /v1/sessions/:session_handle/co

Re: [DISCUSS] FLIP-229: Introduces Join Hint for Flink SQL Batch Job

2022-05-06 Thread Jark Wu
Hi Xuyang,

Thanks for starting this discussion. Join Hint is a long-requested
feature.
I have briefly gone through the design doc. Join Hint is a public API for
SQL syntax.
It should work for both streaming and batch SQL. I understand some special
hints
may only work for batch SQL. Could you demonstrate how the hints affect
stream SQL as well?

Besides that, could you move your design doc to the wiki?
Google docs are usually used for offline discussion, and
discussion in a Google doc is not very visible to the community.
So we would like to move designs to the wiki and discussions to the
mailing list.

Best,
Jark




On Fri, 6 May 2022 at 11:07, Xuyang  wrote:

> Hi, all.
> I want to start a discussion about FLIP-229: Introduces Join Hint
> for Flink SQL Batch Job (the cwiki page [1] is not complete yet, but you
> can see the full details in the doc [2]).
> Join Hint is a common solution in many popular compute engines and DBs
> for mitigating the shortcomings of the optimizer by intervening in plan
> optimization. With a Join Hint, users can intervene in the optimizer's
> choice of join strategy and manually tune the execution plan to improve
> query performance.
> In this FLIP, we propose a set of join hints based on the existing
> join strategies in Flink SQL for batch jobs.
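> For illustration only (the exact hint names are part of the proposal in
> [1]/[2] and may still change), such a hint could look like:
>
> ```sql
> -- Hypothetical example: ask the optimizer to broadcast table t2
> -- instead of letting it choose the join strategy on its own.
> SELECT /*+ BROADCAST(t2) */ t1.id, t2.name
> FROM t1 JOIN t2 ON t1.id = t2.id;
> ```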
> I'm looking forward to your feedback on FLIP-229.
>
>
>
>
> --
>
> Best!
> Xuyang
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-229%3A+Introduces+Join+Hint+for+Flink+SQL+Batch+Job
> [2]
> https://docs.google.com/document/d/1IL00ME0Z0nlXGDWTUPODobVQMAm94PAPr9pw9EdGkoQ/edit?usp=sharing


Re: Re: [ANNOUNCE] New Flink PMC member: Yang Wang

2022-05-06 Thread Jacky Lau
Congrats Yang, well deserved!

Best,
Jacky Lau

Yun Gao  wrote on Sat, May 7, 2022 at 10:44:

> Congratulations Yang!
>
> Best,
> Yun Gao
>
>
>
>  --Original Mail --
> Sender:David Morávek 
> Send Date:Sat May 7 01:05:41 2022
> Recipients:Dev 
> Subject:Re: [ANNOUNCE] New Flink PMC member: Yang Wang
> Nice! Congrats Yang, well deserved! ;)
>
> On Fri 6. 5. 2022 at 17:53, Peter Huang 
> wrote:
>
> > Congrats, Yang!
> >
> >
> >
> > Best Regards
> > Peter Huang
> >
> > On Fri, May 6, 2022 at 8:46 AM Yu Li  wrote:
> >
> > > Congrats and welcome, Yang!
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Fri, 6 May 2022 at 14:48, Paul Lam  wrote:
> > >
> > > > Congrats, Yang! Well deserved!
> > > >
> > > > Best,
> > > > Paul Lam
> > > >
> > > > > On May 6, 2022, at 14:38, Yun Tang  wrote:
> > > > >
> > > > > Congratulations, Yang!
> > > > >
> > > > > Best
> > > > > Yun Tang
> > > > > 
> > > > > From: Jing Ge 
> > > > > Sent: Friday, May 6, 2022 14:24
> > > > > To: dev 
> > > > > Subject: Re: [ANNOUNCE] New Flink PMC member: Yang Wang
> > > > >
> > > > > Congrats Yang, well deserved!
> > > > >
> > > > > Best regards,
> > > > > Jing
> > > > >
> > > > > On Fri, May 6, 2022 at 7:38 AM Lincoln Lee  >
> > > > wrote:
> > > > >
> > > > >> Congratulations Yang!
> > > > >>
> > > > >> Best,
> > > > >> Lincoln Lee
> > > > >>
> > > > >>
> > > > >> Őrhidi Mátyás  wrote on Fri, May 6, 2022 at 12:46:
> > > > >>
> > > > >>> Congrats Yang! Well deserved!
> > > > >>> Best,
> > > > >>> Matyas
> > > > >>>
> > > > >>> On Fri, May 6, 2022 at 5:30 AM huweihua 
> > > > wrote:
> > > > >>>
> > > >  Congratulations Yang!
> > > > 
> > > >  Best,
> > > >  Weihua
> > > > 
> > > > 
> > > > >>>
> > > > >>
> > > >
> > > >
> > >
> >
>


Re: [DISCUSS ] HybridSouce Table & Sql api timeline

2022-05-06 Thread Jacky Lau
I am interested in this. Could you share your DDL first? At the same time,
there is no standard way to describe a hybrid source in DDL.
I also hope that the community can discuss this together and share ideas.

Ran Tao  wrote on Sat, May 7, 2022 at 11:59:

> HybridSource is a good feature, but the current release does not support
> the Table & SQL API; I wonder when it will be ready for end users.
>
> I have implemented an internal version at my company and it works well.
> The Table & SQL API implementation may involve some core questions, e.g.
> the start and end offsets of bounded and unbounded sources, and a child
> source's schema differing from the hybrid source's DDL schema (the batch or
> streaming source may have more or fewer fields). We need to handle
> inconsistent fields or default field values (when a child source lacks some
> fields), etc., so we may need a field mapping here.
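> There is no agreed-upon syntax yet, but as a sketch of the kind of DDL such
> a feature might need (connector and option names are purely hypothetical):
>
> ```sql
> -- Hypothetical hybrid-source DDL: bootstrap from a bounded historical
> -- source, then switch to an unbounded real-time source. Child sources
> -- lacking some fields would need the field mapping / default values
> -- mentioned above.
> CREATE TABLE hybrid_orders (
>   order_id BIGINT,
>   amount   DOUBLE,
>   ts       TIMESTAMP(3)
> ) WITH (
>   'connector' = 'hybrid',            -- hypothetical connector name
>   'sources' = 'historical;realtime'  -- hypothetical child-source list
> );
> ```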
>
> I have some ideas and an implementation; if work on the Table & SQL API is
> in progress, I'm glad to share it or take part in the development.
>
> thanks~
>


flink Job is throwing depdnecy issue when submitted to clusrer

2022-05-06 Thread Great Info
I have a Flink job which reads files from S3 and processes them.
Currently it runs on Flink 1.9.0. I need to upgrade my cluster to
1.13.5, so I have made the changes in my job POM and brought up the Flink
cluster using the 1.13.5 distribution.

When I submit my application I get the error below when it tries to
connect to S3. I have updated the S3 SDK to the latest version, but I still
get the same error:

caused by: java.lang.invoke.lambdaconversionexception: invalid receiver
type interface org.apache.http.header; not a subtype of implementation type
interface org.apache.http.namevaluepair

It works when I just run it as a mini-cluster (running just java -jar
) and also when I submit it to a Flink cluster running 1.9.0.

I am not able to understand where the dependency mismatch is happening.


Re: flink Job is throwing depdnecy issue when submitted to clusrer

2022-05-06 Thread 张立志
Unsubscribe



zh_ha...@163.com




 Original message 
| From | Great Info |
| Date | 2022-05-07 13:21 |
| To | dev@flink.apache.org, user |
| Cc | |
| Subject | flink Job is throwing depdnecy issue when submitted to clusrer |


Re: Source alignment for Iceberg

2022-05-06 Thread Becket Qin
Hey Steven,

Your conclusion at this point sounds reasonable to me. That being said, I
think we need to consider a bit more about the extensibility of Flink in
the future. I would be happy to drive some efforts in that direction. So
later on, the timestamp alignment of Iceberg may be able to leverage some
of the capabilities in the framework.

Thanks again for the detailed discussion!

Cheers,

Jiangjie (Becket) Qin

On Sat, May 7, 2022 at 11:15 AM Steven Wu  wrote:

> might be the same as => might NOT be the same as
>
> On Fri, May 6, 2022 at 8:13 PM Steven Wu  wrote:
>
> > The conclusion of this discussion could be that we don't see much value
> > in leveraging FLIP-182 with the Iceberg source. That would be totally
> > fine.
> >
> > For me, one big sticking point is that the alignment timestamp for the
> > (Iceberg) source might be the same as the Flink application watermark.
> > On Thu, May 5, 2022 at 9:53 PM Piotr Nowojski 
> > wrote:
> >
> >> Option 1 sounds reasonable, but I would be tempted to wait for a second
> >> motivating use case before generalizing the framework. However, I
> >> wouldn't oppose this extension if others feel it's useful and a good
> >> thing to do.
> >>
> >> Piotrek
> >>
> >> > On 06.05.2022, at 03:50, Becket Qin  wrote:
> >> >
> >> > I think the key point here is essentially what information Flink
> >> > should expose to the user pluggables. Apparently the split / local
> >> > task watermark is something many user pluggables would be interested
> >> > in. Right now it is calculated by the Flink framework but not exposed
> >> > to user space, i.e. the SourceReader / SplitEnumerator. So it looks
> >> > like we can at least offer this information in some way so users can
> >> > leverage it.
> >> >
> >> > That said, I am not sure if this would help in the Iceberg alignment
> >> > case, because at this point FLIP-182 reports source reader watermarks
> >> > periodically, which may not align with the RequestSplitEvent. So if
> >> > we really want to leverage the FLIP-182 mechanism here, I see a few
> >> > ways, just to name two of them:
> >> > 1. We can expose the source reader watermark in the
> >> > SourceReaderContext, so the source readers can put the local
> >> > watermark in a custom operator event. This will effectively bypass
> >> > the existing RequestSplitEvent. Or we can also extend the
> >> > RequestSplitEvent to add an additional info field of byte[] type, so
> >> > users can piggyback additional information there, be it a watermark
> >> > or other stuff.
> >> > 2. Simply piggyback the local watermark in the RequestSplitEvent and
> >> > pass that info to the SplitEnumerator as well.
> >> >
> >> > If we are going to do this, personally I'd prefer the first way, as
> >> > it provides a mechanism to allow future extension. So it would be
> >> > easier to expose other framework information to the user space in
> >> > the future.
> >> >
> >> > Thanks,
> >> >
> >> > Jiangjie (Becket) Qin
> >> >
> >> >
> >> >
> >> >> On Fri, May 6, 2022 at 6:15 AM Thomas Weise  wrote:
> >> >>
> >> >>> On Wed, May 4, 2022 at 11:03 AM Steven Wu  wrote:
> >> >>> Any opinion on a different timestamp for source alignment (vs. the
> >> >>> Flink application watermark)? For the Iceberg source, we might want
> >> >>> to enforce alignment on the Kafka timestamp, but the Flink
> >> >>> application watermark may use an event time field from the payload.
> >> >>
> >> >> I imagine that more generally the question is alignment based on the
> >> >> Iceberg partition/file metadata vs. individual rows? I think that
> >> >> should work as long as there is a guarantee on the out-of-orderness
> >> >> within the split?
> >> >>
> >> >> Thomas
> >> >>
> >> >>>
> >> >>> Thanks,
> >> >>> Steven
> >> >>>
> >> >>> On Wed, May 4, 2022 at 7:02 AM Becket Qin 
> >> wrote:
> >> 
> >>  Hey Piotr,
> >> 
> >>  I think the mechanism FLIP-182 provides is a reasonable default one,
> >>  which ensures the watermarks only drift by an upper bound. However,
> >>  admittedly there are also other strategies for different purposes.
> >> 
> >>  In the Iceberg case, I am not sure a static, strictly enforced
> >>  watermark drift is desired. The source might just want to finish
> >>  reading the assigned splits as fast as possible, and it is OK to have
> >>  a drift of "one split" instead of a fixed time period.
> >> 
> >>  As another example, if there are some fast readers whose splits are
> >>  always throttled, while the other slow readers are struggling to keep
> >>  up with the rest of the splits, the split enumerator may decide to
> >>  reassign the slow splits so all the readers have something to read.
> >>  This would need the SplitEnumerator to be aware of the watermark
> >>  progress on each reader. So it seems useful to expose the 

Re: Edit Permissions for Flink Connector Template

2022-05-06 Thread Martijn Visser
Hi all,

For context, I've had an offline discussion with Jeremy on what's needed to
propose a new Flink connector. That's why there is a need to create a FLIP.
His Confluence user name is jeremyber.

Best regards,

Martijn

On Sat, May 7, 2022 at 04:18, Xintong Song  wrote:

> Hi Jeremy,
>
> Could you add a link to the previous discussion?
>
> And you would need to first create an account at
> https://cwiki.apache.org/confluence .
>
> Thank you~
>
> Xintong Song
>
>
>
> On Fri, May 6, 2022 at 9:26 PM Ber, Jeremy 
> wrote:
>
> > Hello,
> >
> > I require Confluence Edit Permissions in order to create a Flink
> Connector
> > Template page as discussed via e-mail.
> >
> > Jeremy
> >
>
-- 
---
Martijn Visser
https://twitter.com/MartijnVisser82
https://github.com/MartijnVisser