Apply for Contributor Permission

2022-05-22 Thread DavidLiu
Hi Guys, 
I want to contribute to Apache Flink. 
Would you please give me the permission as a contributor? 
My JIRA ID is DavidLiu001




Thanks & BR
DavidLiu



Apply for Contributor Permission

2022-05-22 Thread DavidLiu
Hi Guys, 
I want to contribute to Apache Flink. 
Would you please give me the permission as a contributor? 
My JIRA ID is DavidLiu001




Thanks & BR
DavidLiu



[jira] [Created] (FLINK-27732) [JUnit5 Migration] Module: flink-examples-table

2022-05-22 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-27732:
---

 Summary: [JUnit5 Migration] Module: flink-examples-table
 Key: FLINK-27732
 URL: https://issues.apache.org/jira/browse/FLINK-27732
 Project: Flink
  Issue Type: Bug
Reporter: Sergey Nuyanzin






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27733) Rework on_timer output behind watermark bug fix

2022-05-22 Thread Juntao Hu (Jira)
Juntao Hu created FLINK-27733:
-

 Summary: Rework on_timer output behind watermark bug fix
 Key: FLINK-27733
 URL: https://issues.apache.org/jira/browse/FLINK-27733
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Affects Versions: 1.15.0
Reporter: Juntao Hu
 Fix For: 1.16.0


FLINK-27676 can be simplified by just checking isBundleFinished() before 
emitting watermark in AbstractPythonFunctionOperator, and this fix FLINK-27676 
in python group window aggregate too.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27734) Not showing checkpoint interval properly in WebUI when checkpoint is disabled

2022-05-22 Thread Feifan Wang (Jira)
Feifan Wang created FLINK-27734:
---

 Summary: Not showing checkpoint interval properly  in WebUI when 
checkpoint is disabled
 Key: FLINK-27734
 URL: https://issues.apache.org/jira/browse/FLINK-27734
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Reporter: Feifan Wang
 Attachments: image-2022-05-22-23-42-46-365.png

Not showing checkpoint interval properly  in WebUI when checkpoint is disabled

!image-2022-05-22-23-42-46-365.png|width=1019,height=362!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27735) Update testcontainers dependency to v1.17.2

2022-05-22 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-27735:
---

 Summary: Update testcontainers dependency to v1.17.2
 Key: FLINK-27735
 URL: https://issues.apache.org/jira/browse/FLINK-27735
 Project: Flink
  Issue Type: Technical Debt
  Components: Test Infrastructure, Tests
Reporter: Sergey Nuyanzin


testcontainers 1.17.2 is released

Among others there is a fix for connection leak in jdbc, performance
Main benefits (based on 
[https://github.com/testcontainers/testcontainers-java/releases/tag/1.17.2)]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [VOTE] FLIP-91: Support SQL Gateway

2022-05-22 Thread Jingsong Li
+1

Best,
Jingsong

On Sat, May 21, 2022 at 12:23 AM Shqiprim Bunjaku 
wrote:

> +1 (non-binding)
>
> Best,
> Shqiprim Bunjaku
>
> On Fri, May 20, 2022 at 5:56 PM Yufei Zhang  wrote:
>
> > +1 (non-binding)
> >
> > Best,
> > Yufei Zhang
> >
> > On Fri, May 20, 2022 at 11:24 AM Paul Lam  wrote:
> >
> > > +1 (non-binding)
> > >
> > > Best,
> > > Paul Lam
> > >
> > > > 2022年5月20日 10:48,Jark Wu  写道:
> > > >
> > > > +1 (binding)
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > On Fri, 20 May 2022 at 10:39, Shengkai Fang 
> wrote:
> > > >
> > > >> Hi, everyone.
> > > >>
> > > >> Thanks for your feedback for FLIP-91: Support SQL Gateway[1] on the
> > > >> discussion thread[2]. I'd like to start a vote for it. The vote will
> > be
> > > >> open for at least 72 hours unless there is an objection or not
> enough
> > > >> votes.
> > > >>
> > > >> Best,
> > > >> Shengkai
> > > >>
> > > >> [1]
> > > >>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Gateway
> > > >> [2]https://lists.apache.org/thread/gr7soo29z884r1scnz77r2hwr2xmd9b0
> > > >>
> > >
> > >
> >
>


Re:Apply for Contributor Permission

2022-05-22 Thread Xuyang
Hi, everyone can reply the issue but contributors also could not assign the 
issue to themselves. You can reply to the discussion in the issue where you 
what to fix the bug, and if a committer sees it, he will assign to you. 




--

Best!
Xuyang





At 2022-05-22 18:58:02, "DavidLiu"  wrote:
>Hi Guys, 
>I want to contribute to Apache Flink. 
>Would you please give me the permission as a contributor? 
>My JIRA ID is DavidLiu001
>
>
>
>
>Thanks & BR
>DavidLiu
>


Application mode -yarn dependancy error

2022-05-22 Thread Zain Haider Nemati
Hi,
I'm getting this error in yarn application mode when submitting my job.

Caused by: java.lang.ClassCastException: cannot assign instance of
org.apache.commons.collections.map.LinkedMap to field
org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.pendingOffsetsToCommit
of type org.apache.commons.collections.map.LinkedMap in instance of
org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer
at
java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2301)
~[?:1.8.0_332]
at
java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1431)
~[?:1.8.0_332]
at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2437)
~[?:1.8.0_332]
at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
~[?:1.8.0_332]
at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
~[?:1.8.0_332]
at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
~[?:1.8.0_332]
at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
~[?:1.8.0_332]
at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
~[?:1.8.0_332]
at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
~[?:1.8.0_332]
at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
~[?:1.8.0_332]
at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
~[?:1.8.0_332]
at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
~[?:1.8.0_332]
at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
~[?:1.8.0_332]
at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
~[?:1.8.0_332]
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
~[?:1.8.0_332]
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
~[?:1.8.0_332]
at
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:615)
~[streamingjobs-1.13.jar:?]
at
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:600)
~[streamingjobs-1.13.jar:?]
at
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:587)
~[streamingjobs-1.13.jar:?]
at
org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:541)
~[streamingjobs-1.13.jar:?]
at
org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperatorFactory(StreamConfig.java:322)
~[flink-dist_2.12-1.13.1.jar:1.13.1]
at
org.apache.flink.streaming.runtime.tasks.OperatorChain.(OperatorChain.java:154)
~[flink-dist_2.12-1.13.1.jar:1.13.1]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:548)
~[flink-dist_2.12-1.13.1.jar:1.13.1]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
~[flink-dist_2.12-1.13.1.jar:1.13.1]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:537)
~[flink-dist_2.12-1.13.1.jar:1.13.1]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759)
~[streamingjobs-1.13.jar:?]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
~[streamingjobs-1.13.jar:?]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_332]


Flink UI in Application Mode

2022-05-22 Thread Zain Haider Nemati
Hi,
Which port does flink UI run on in application mode?
If I am running 5 yarn jobs in application mode would the UI be same for
each or different ports for each?


[jira] [Created] (FLINK-27736) Pulsar sink catch watermark error

2022-05-22 Thread LuNng Wang (Jira)
LuNng Wang created FLINK-27736:
--

 Summary: Pulsar sink catch watermark error
 Key: FLINK-27736
 URL: https://issues.apache.org/jira/browse/FLINK-27736
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream, Connectors / Pulsar
Affects Versions: 1.15.0
Reporter: LuNng Wang


{code:java}
public class WatermarkDemo {

private final static String SERVICE_URL = "pulsar://localhost:6650";
private final static String ADMIN_URL = "http://localhost:8080";;

public static void main(String[] args) throws Exception {

final StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();

PulsarSource source = PulsarSource.builder()
.setServiceUrl(SERVICE_URL)
.setAdminUrl(ADMIN_URL)
.setStartCursor(StartCursor.earliest())
.setTopics("ada")

.setDeserializationSchema(PulsarDeserializationSchema.flinkSchema(new 
SimpleStringSchema()))
.setSubscriptionName("my-subscription")
.setSubscriptionType(SubscriptionType.Exclusive)
.build();

PulsarSink sink = PulsarSink.builder()
.setServiceUrl(SERVICE_URL)
.setAdminUrl(ADMIN_URL)
.setTopics("beta")

.setSerializationSchema(PulsarSerializationSchema.flinkSchema(new 
SimpleStringSchema()))
.setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
.build();


DataStream stream = env.fromSource(source, 
WatermarkStrategy.forMonotonousTimestamps(), "Pulsar Source");
stream.sinkTo(sink);

env.execute();

}
} {code}
It will throw the following error.
{code:java}
Caused by: java.lang.IllegalArgumentException: Invalid timestamp : '0'
    at 
org.apache.pulsar.shade.com.google.common.base.Preconditions.checkArgument(Preconditions.java:203)
    at 
org.apache.pulsar.client.impl.TypedMessageBuilderImpl.eventTime(TypedMessageBuilderImpl.java:204)
    at 
org.apache.flink.connector.pulsar.sink.writer.PulsarWriter.createMessageBuilder(PulsarWriter.java:216)
    at 
org.apache.flink.connector.pulsar.sink.writer.PulsarWriter.write(PulsarWriter.java:141)
    at 
org.apache.flink.streaming.runtime.operators.sink.SinkWriterOperator.processElement(SinkWriterOperator.java:158)
    at 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:82)
    at 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:57)
    at 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:29)
    at 
org.apache.flink.streaming.runtime.tasks.SourceOperatorStreamTask$AsyncDataOutputToOutput.emitRecord(SourceOperatorStreamTask.java:313)
    at 
org.apache.flink.streaming.api.operators.source.SourceOutputWithWatermarks.collect(SourceOutputWithWatermarks.java:110)
    at 
org.apache.flink.connector.pulsar.source.reader.emitter.PulsarRecordEmitter.emitRecord(PulsarRecordEmitter.java:41)
    at 
org.apache.flink.connector.pulsar.source.reader.emitter.PulsarRecordEmitter.emitRecord(PulsarRecordEmitter.java:33)
    at 
org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:143)
    at 
org.apache.flink.connector.pulsar.source.reader.source.PulsarOrderedSourceReader.pollNext(PulsarOrderedSourceReader.java:106)
    at 
org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:385)
    at 
org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68)
    at 
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
    at 
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:519)
    at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:203)
    at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:804)
    at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:753)
    at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563)
    at java.lang.Thread.run(Thread.java:748) {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] FLIP-232: Add Retry Support For Async I/O In DataStream API

2022-05-22 Thread Gen Luo
Thank Lincoln for the proposal!

The FLIP looks good to me. I'm in favor of the timer based implementation,
and I'd like to share some thoughts.

I'm thinking if we have to store the retry status in the state. I suppose
the retrying requests can just submit as the first attempt when the job
restores from a checkpoint, since in fact the side effect of the retries
can not draw back by the restoring. This makes the state simpler and makes
it unnecessary to do the state migration, and can be more compatible when
the user restart a job with a changed retry strategy.

Besides, I find it hard to implement a flexible backoff strategy with the
current AsyncRetryStrategy interface, for example an
ExponentialBackoffRetryStrategy. Maybe we can add a parameter of the
attempt or just use the org.apache.flink.util.concurrent.RetryStrategy to
take the place of the retry strategy part in the AsyncRetryStrategy?

Lincoln Lee  于 2022年5月20日周五 14:24写道:

> Hi everyone,
>
>By comparing the two internal implementations of delayed retries, we
> prefer the timer-based solution, which obtains precise delay control
> through simple logic and only needs to pay (what we consider to be
> acceptable) timer instance cost for the retry element.  The FLIP[1] doc has
> been updated.
>
> [1]:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211883963
>
> Best,
> Lincoln Lee
>
>
> Lincoln Lee  于2022年5月16日周一 15:09写道:
>
> > Hi Jinsong,
> >
> > Good question!
> >
> > The delayQueue is very similar to incompleteElements in
> > UnorderedStreamElementQueue, it only records the references of in-flight
> > retry elements, the core value is for the ease of a fast scan when force
> > flush during endInput and less refactor for existing logic.
> >
> > Users needn't configure a new capacity for the delayQueue, just turn the
> > original one up (if needed).
> > And separately store the input data and retry state is mainly to
> implement
> > backwards compatibility. The first version of Poc, I used a single
> combined
> > state in order to reduce state costs, but hard to keep compatibility, and
> > changed  into two via Yun Gao's concern about the compatibility.
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Jingsong Li  于2022年5月16日周一 14:48写道:
> >
> >> Thanks  Lincoln for your reply.
> >>
> >> I'm a little confused about the relationship between Ordered/Unordered
> >> Queue and DelayQueue. Why do we need to have a DelayQueue?
> >> Can we remove the DelayQueue and put the state of the retry in the
> >> StreamRecordQueueEntry (seems like it's already in the FLIP)
> >> The advantages of doing this are:
> >> 1. twice less data is stored in state
> >> 2. the concept is unified, the user only needs to configure one queue
> >> capacity
> >>
> >> Best,
> >> Jingsong
> >>
> >> On Mon, May 16, 2022 at 12:10 PM Lincoln Lee 
> >> wrote:
> >>
> >> > Hi Jinsong,
> >> > Thanks for your feedback! Let me try to answer the two questions:
> >> >
> >> > For q1: Motivation
> >> > Yes, users can implement retries themselves based on the external
> async
> >> > client, but this requires each user to do similar things, and if we
> can
> >> > support retries uniformly, user code would become much simpler.
> >> >
> >> > > The real external call should happen in the asynchronous thread.
> >> > My question is: If the user makes a retry in this asynchronous thread
> by
> >> > themselves, is there a difference between this and the current FLIP's?
> >> >
> >> >
> >> > For q2: Block Main Thread
> >> > You're right, the queue data will be stored in the ListState which is
> an
> >> > OperateState, though in fact, for ListState storage, the theoretical
> >> upper
> >> > limit is Integer.MAX_VALUE, but we can't increase the queue capacity
> too
> >> > big in production because the risk of OOM increases when the queue
> >> capacity
> >> > grows, and increases the task parallelism maybe a more viable way when
> >> > encounter too many retry items for a single task.
> >> > We recommend using a proper estimate of queue capacity based on the
> >> formula
> >> > like this: 'inputRate * retryRate * avgRetryDuration', and also the
> >> actual
> >> > checkpoint duration in runtime.
> >> >
> >> > > If I understand correctly, the retry queue will be put into
> ListState,
> >> > this
> >> > state is OperatorState? As far as I know, OperatorState does not have
> >> the
> >> > ability to store a lot of data.
> >> > So after we need to retry more data, we should need to block the main
> >> > thread? What is the maximum size of the default retry queue?
> >> >
> >> >
> >> >
> >> > Best,
> >> > Lincoln Lee
> >> >
> >> >
> >> > Jingsong Li  于2022年5月16日周一 10:31写道:
> >> >
> >> > > Thank Lincoln for the proposal.
> >> > >
> >> > > ## Motivation:
> >> > >
> >> > > > asyncInvoke and callback functions are executed synchronously by
> the
> >> > main
> >> > > thread, which is not suitable adding long time blocking operations,
> >> and
> >> > > introducing additional thread will bring extra complexity 

Re: [VOTE] FLIP-91: Support SQL Gateway

2022-05-22 Thread Nicholas Jiang
+1 (non-binding)

Best,
Nicholas Jiang

On 2022/05/20 02:38:39 Shengkai Fang wrote:
> Hi, everyone.
> 
> Thanks for your feedback for FLIP-91: Support SQL Gateway[1] on the
> discussion thread[2]. I'd like to start a vote for it. The vote will be
> open for at least 72 hours unless there is an objection or not enough votes.
> 
> Best,
> Shengkai
> 
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Gateway
> [2]https://lists.apache.org/thread/gr7soo29z884r1scnz77r2hwr2xmd9b0
> 


[jira] [Created] (FLINK-27737) Clean the outdated comments and unfencedMainExecutor

2022-05-22 Thread Aitozi (Jira)
Aitozi created FLINK-27737:
--

 Summary: Clean the outdated comments and unfencedMainExecutor
 Key: FLINK-27737
 URL: https://issues.apache.org/jira/browse/FLINK-27737
 Project: Flink
  Issue Type: Technical Debt
  Components: Runtime / Coordination
Reporter: Aitozi


This ticket is meant to clean the outdated and confusing comments about main 
executor usage eg: 
[link|https://github.com/apache/flink/blob/18a967f8ad7b22c2942e227fb84f08f552660b5a/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/KubernetesResourceManagerDriver.java#L248]
 and also the unfencedMainThreadExecutor stuff in {{FencedRpcEndpoint}}.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [VOTE] FLIP-91: Support SQL Gateway

2022-05-22 Thread LuNing Wang
+1 (non-binding)

Best,
LuNing Wang

Nicholas Jiang  于2022年5月23日周一 12:57写道:

> +1 (non-binding)
>
> Best,
> Nicholas Jiang
>
> On 2022/05/20 02:38:39 Shengkai Fang wrote:
> > Hi, everyone.
> >
> > Thanks for your feedback for FLIP-91: Support SQL Gateway[1] on the
> > discussion thread[2]. I'd like to start a vote for it. The vote will be
> > open for at least 72 hours unless there is an objection or not enough
> votes.
> >
> > Best,
> > Shengkai
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Gateway
> > [2]https://lists.apache.org/thread/gr7soo29z884r1scnz77r2hwr2xmd9b0
> >
>


[DISCUSS] Initializing Apache Flink Slack

2022-05-22 Thread Xintong Song
Hi devs,

As we have approved on creating an Apache Flink Slack [1], I'm starting
this new thread to coordinate and give updates on the issues related to
initializing the slack workspace.

## Progress
1. Jark and I have worked on a draft of Slack Management Regulations [2],
including Code of Conduct, Roles and Permissions, etc. Looking forward to
feedback.
2. We have created a slack workspace for initial setups. Anyone who wants
to help with the initialization or just to take an early look, please reach
out for an invitation.
3. The URLs "apache-flink.slack.com" and "apacheflink.slack.com" have
already been taken. I've already sent a help request to the Slack team and
am waiting for their response. Before this gets resolved, we are using "
asf-flink.slack.com" for the moment.
4. I've created FLINK-27719 [3] for tracking the remaining tasks, including
setting up the auto-updated invitation link and the archive.

## How can you help
1. Take a look at the Slack Management Regulations [2] and provide your
feedback
2. Check and pick-up tasks from FLINK-27719 [3]

Best,

Xintong


[1] https://lists.apache.org/thread/d556ywochkpqxbo1yh7ojm751whtojxp

[2]
https://cwiki.apache.org/confluence/display/FLINK/WIP%3A+Slack+Management

[3] https://issues.apache.org/jira/browse/FLINK-27719


Re: [ANNOUNCE] Kubernetes Operator release-1.0 branch cut

2022-05-22 Thread Yang Wang
All the blockers and major issues have been merged into release-1.0 branch.
Just follow what we have promised, I am preparing the first release
candidate now and will share it for the community-wide review today.


Best,
Yang

Márton Balassi  于2022年5月18日周三 00:29写道:

> Thanks Gyula and Yang. Awesome!
>
> On Tue, May 17, 2022 at 4:46 PM Gyula Fóra  wrote:
>
> > Hi Flink devs!
> >
> > The release-1.0 branch has been forked from main and version numbers have
> > been upgraded accordingly.
> >
> > https://github.com/apache/flink-kubernetes-operator/tree/release-1.0
> >
> > The version on the main branch has been updated to 1.1-SNAPSHOT.
> >
> > From now on, for PRs that should be presented in 1.0.0, please make sure:
> > - Merge the PRs first to main, then backport to release-1.0 branch
> > - The JIRA ticket should be closed with the correct fix-versions
> >
> > There are still a few outstanding tickets, mainly docs/minor fixes but no
> > currently known blocker issues.
> >
> > We are working together with Yang to prepare the first RC by next monday.
> >
> > Cheers,
> > Gyula
> >
>


Re: [DISCUSS] Initializing Apache Flink Slack

2022-05-22 Thread Kyle Bendickson
Hi Xintong!

I’d love to take an early look if possible. I’m not a committer here, but I
have a role in the Apache Iceberg slack and am looking forward to the Flink
slack, particularly given the strong integration between Iceberg and Flink.

Thanks,
Kyle!
kjbendickson[at]gmail.com (preferred invite email)
kyle[at]tabular.io (work email)

On Sun, May 22, 2022 at 10:14 PM Xintong Song  wrote:

> Hi devs,
>
> As we have approved on creating an Apache Flink Slack [1], I'm starting
> this new thread to coordinate and give updates on the issues related to
> initializing the slack workspace.
>
> ## Progress
> 1. Jark and I have worked on a draft of Slack Management Regulations [2],
> including Code of Conduct, Roles and Permissions, etc. Looking forward to
> feedback.
> 2. We have created a slack workspace for initial setups. Anyone who wants
> to help with the initialization or just to take an early look, please reach
> out for an invitation.
> 3. The URLs "apache-flink.slack.com" and "apacheflink.slack.com" have
> already been taken. I've already sent a help request to the Slack team and
> am waiting for their response. Before this gets resolved, we are using "
> asf-flink.slack.com" for the moment.
> 4. I've created FLINK-27719 [3] for tracking the remaining tasks, including
> setting up the auto-updated invitation link and the archive.
>
> ## How can you help
> 1. Take a look at the Slack Management Regulations [2] and provide your
> feedback
> 2. Check and pick-up tasks from FLINK-27719 [3]
>
> Best,
>
> Xintong
>
>
> [1] https://lists.apache.org/thread/d556ywochkpqxbo1yh7ojm751whtojxp
>
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/WIP%3A+Slack+Management
>
> [3] https://issues.apache.org/jira/browse/FLINK-27719
>


Re: [DISCUSS] Initializing Apache Flink Slack

2022-05-22 Thread Xintong Song
Hi Kyle,

I've sent an invitation to your gmail.

Best,

Xintong



On Mon, May 23, 2022 at 1:39 PM Kyle Bendickson  wrote:

> Hi Xintong!
>
> I’d love to take an early look if possible. I’m not a committer here, but I
> have a role in the Apache Iceberg slack and am looking forward to the Flink
> slack, particularly given the strong integration between Iceberg and Flink.
>
> Thanks,
> Kyle!
> kjbendickson[at]gmail.com (preferred invite email)
> kyle[at]tabular.io (work email)
>
> On Sun, May 22, 2022 at 10:14 PM Xintong Song 
> wrote:
>
> > Hi devs,
> >
> > As we have approved on creating an Apache Flink Slack [1], I'm starting
> > this new thread to coordinate and give updates on the issues related to
> > initializing the slack workspace.
> >
> > ## Progress
> > 1. Jark and I have worked on a draft of Slack Management Regulations [2],
> > including Code of Conduct, Roles and Permissions, etc. Looking forward to
> > feedback.
> > 2. We have created a slack workspace for initial setups. Anyone who wants
> > to help with the initialization or just to take an early look, please
> reach
> > out for an invitation.
> > 3. The URLs "apache-flink.slack.com" and "apacheflink.slack.com" have
> > already been taken. I've already sent a help request to the Slack team
> and
> > am waiting for their response. Before this gets resolved, we are using "
> > asf-flink.slack.com" for the moment.
> > 4. I've created FLINK-27719 [3] for tracking the remaining tasks,
> including
> > setting up the auto-updated invitation link and the archive.
> >
> > ## How can you help
> > 1. Take a look at the Slack Management Regulations [2] and provide your
> > feedback
> > 2. Check and pick-up tasks from FLINK-27719 [3]
> >
> > Best,
> >
> > Xintong
> >
> >
> > [1] https://lists.apache.org/thread/d556ywochkpqxbo1yh7ojm751whtojxp
> >
> > [2]
> >
> https://cwiki.apache.org/confluence/display/FLINK/WIP%3A+Slack+Management
> >
> > [3] https://issues.apache.org/jira/browse/FLINK-27719
> >
>


[jira] [Created] (FLINK-27738) instance KafkaSink support config topic properties

2022-05-22 Thread LCER (Jira)
LCER created FLINK-27738:


 Summary: instance KafkaSink support config topic properties
 Key: FLINK-27738
 URL: https://issues.apache.org/jira/browse/FLINK-27738
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kafka
Affects Versions: 1.15.0
Reporter: LCER


I use KafkaSink to config Kafka information as following:

*KafkaSink.builder()*
 *.setBootstrapServers(brokers)*
 *.setRecordSerializer(KafkaRecordSerializationSchema.builder()*
 *.setTopicSelector(topicSelector)*
 *.setValueSerializationSchema(new SimpleStringSchema())*
 *.build()*
 *)*
 *.setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE)*
 *.setKafkaProducerConfig(properties)*
 *.build();*

**

*I can't find any method to support config topic properties*



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [VOTE] FLIP-91: Support SQL Gateway

2022-05-22 Thread godfrey he
+1

Best,
Godfrey

LuNing Wang  于2022年5月23日周一 13:06写道:
>
> +1 (non-binding)
>
> Best,
> LuNing Wang
>
> Nicholas Jiang  于2022年5月23日周一 12:57写道:
>
> > +1 (non-binding)
> >
> > Best,
> > Nicholas Jiang
> >
> > On 2022/05/20 02:38:39 Shengkai Fang wrote:
> > > Hi, everyone.
> > >
> > > Thanks for your feedback for FLIP-91: Support SQL Gateway[1] on the
> > > discussion thread[2]. I'd like to start a vote for it. The vote will be
> > > open for at least 72 hours unless there is an objection or not enough
> > votes.
> > >
> > > Best,
> > > Shengkai
> > >
> > > [1]
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Gateway
> > > [2]https://lists.apache.org/thread/gr7soo29z884r1scnz77r2hwr2xmd9b0
> > >
> >


About Native Deployment's Autoscaling implementation

2022-05-22 Thread Talat Uyarer
Hi,
I am working on auto scaling support for native deployments. Today Flink
provides Reactive mode however it only runs on standalone deployments. We
use Kubernetes native deployment. So I want to increase or decrease job
resources for our streamin jobs. Recent Flip-138 and Flip-160 are very
useful to achieve this goal. I started reading code of Flink JobManager,
AdaptiveScheduler and DeclarativeSlotPool etc.

My assumption is Required Resources will be calculated on AdaptiveScheduler
whenever the scheduler receives a heartbeat from a task manager by calling
public void updateAccumulators(AccumulatorSnapshot accumulatorSnapshot)
method.

I checked TaskExecutorToJobManagerHeartbeatPayload class however I only see
*accumulatorReport* and *executionDeploymentReport* . Do you have any
suggestions to collect metrics from TaskManagers ? Should I add metrics on
TaskExecutorToJobManagerHeartbeatPayload ?

I am open to another suggestion for this. Whenever I finalize my
investigation. I will create a FLIP for more detailed implementation.

Thanks for your help in advance.
Talat


Re: [DISCUSS] FLIP-232: Add Retry Support For Async I/O In DataStream API

2022-05-22 Thread Lincoln Lee
Hi Gen Luo,
Thanks a lot for your feedback!

1. About the retry state:
I considered dropping the retry state which really simplifies state changes
and avoids compatibility handling. The only reason I changed my mind was
that it might be lossy to the user. Elements that has been tried several
times but not exhausted its retry opportunities will reset the retry state
after a job failover-restart and start the retry process again (if the
retry condition persists true), which may cause a greater delay for the
retried elements, actually retrying more times and for longer than
expected. (Although in the PoC may also have a special case when
recovering: if the remaining timeout is exhausted for the recalculation, it
will execute immediately but will have to register a timeout timer for the
async, here using an extra backoffTimeMillis)
For example, '60s fixed-delay retry if empty result, max-attempts: 5,
timeout 300s'
When checkpointing, some data has been retry 2 times, then suppose the job
is restarted and it takes 2min when the restart succeeds, if we drop the
retry state, the worst case will take more 240s(60s * 2 + 2min) delay for
users to finish retry.

For my understanding(please correct me if I missed something), if a job is
resumed from a previous state and the retry strategy is changed, the
elements that need to be recovered in the retry state just needs the new
strategy to take over the current attempts and time that has been used,  or
give up retry if no retry strategy was set.
> and can be more compatible when the user restart a job with a changed
retry strategy.

2.  About the interface, do you think it would be helpful if add the
currentAttempts into getBackoffTimeMillis()? e.g.,  long
getBackoffTimeMillis(int currentAttempts)
The existing RetryStrategy and RestartBackoffTimeStrategy were in my
candidate list but not exactly match, and I want to avoid creating the new
instances for every attempt in RetryStrategy.

WDYT?

Best,
Lincoln Lee


Gen Luo  于2022年5月23日周一 11:37写道:

> Thank Lincoln for the proposal!
>
> The FLIP looks good to me. I'm in favor of the timer based implementation,
> and I'd like to share some thoughts.
>
> I'm thinking if we have to store the retry status in the state. I suppose
> the retrying requests can just submit as the first attempt when the job
> restores from a checkpoint, since in fact the side effect of the retries
> can not draw back by the restoring. This makes the state simpler and makes
> it unnecessary to do the state migration, and can be more compatible when
> the user restart a job with a changed retry strategy.
>
> Besides, I find it hard to implement a flexible backoff strategy with the
> current AsyncRetryStrategy interface, for example an
> ExponentialBackoffRetryStrategy. Maybe we can add a parameter of the
> attempt or just use the org.apache.flink.util.concurrent.RetryStrategy to
> take the place of the retry strategy part in the AsyncRetryStrategy?
>
> Lincoln Lee  于 2022年5月20日周五 14:24写道:
>
> > Hi everyone,
> >
> >By comparing the two internal implementations of delayed retries, we
> > prefer the timer-based solution, which obtains precise delay control
> > through simple logic and only needs to pay (what we consider to be
> > acceptable) timer instance cost for the retry element.  The FLIP[1] doc
> has
> > been updated.
> >
> > [1]:
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211883963
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Lincoln Lee  于2022年5月16日周一 15:09写道:
> >
> > > Hi Jinsong,
> > >
> > > Good question!
> > >
> > > The delayQueue is very similar to incompleteElements in
> > > UnorderedStreamElementQueue, it only records the references of
> in-flight
> > > retry elements, the core value is for the ease of a fast scan when
> force
> > > flush during endInput and less refactor for existing logic.
> > >
> > > Users needn't configure a new capacity for the delayQueue, just turn
> the
> > > original one up (if needed).
> > > And separately store the input data and retry state is mainly to
> > implement
> > > backwards compatibility. The first version of Poc, I used a single
> > combined
> > > state in order to reduce state costs, but hard to keep compatibility,
> and
> > > changed  into two via Yun Gao's concern about the compatibility.
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > >
> > > Jingsong Li  于2022年5月16日周一 14:48写道:
> > >
> > >> Thanks  Lincoln for your reply.
> > >>
> > >> I'm a little confused about the relationship between Ordered/Unordered
> > >> Queue and DelayQueue. Why do we need to have a DelayQueue?
> > >> Can we remove the DelayQueue and put the state of the retry in the
> > >> StreamRecordQueueEntry (seems like it's already in the FLIP)
> > >> The advantages of doing this are:
> > >> 1. twice less data is stored in state
> > >> 2. the concept is unified, the user only needs to configure one queue
> > >> capacity
> > >>
> > >> Best,
> > >> Jingsong
> > >>
> > >> On Mon, May 16, 2022 at 12:10 PM Linc

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-22 Thread Qingsheng Ren
Hi devs,

We recently updated FLIP-221 [1] in order to make the concept of caching 
clearer and introduce some new common lookup table options. The changes are 
listed below:

1. "LRU cache" and “all cache” have been renamed as “partial cache” and “full 
cache”. It is not necessary for the cache to using only size-based LRU evection 
policy so using “partial” will be more precise. This change was inspired by the 
definition in Microsoft SQL Server [2][3]. “RescanRuntimeProvider” has been 
renamed to “FullCachingLookupProvider” accordingly.

2. Common lookup options are introduced in the “Table Options for Lookup Cache” 
section to unify the usage of lookup tables.

Looking forward to your feedbacks!

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-221%3A+Abstraction+for+lookup+source+cache+and+metric
[2] 
https://docs.microsoft.com/en-us/sql/integration-services/connection-manager/lookup-transformation-full-cache-mode-cache-connection-manager
[3] 
https://docs.microsoft.com/en-us/sql/integration-services/data-flow/transformations/implement-a-lookup-in-no-cache-or-partial-cache-mode

Best regards, 

Qingsheng

> On May 18, 2022, at 17:04, Qingsheng Ren  wrote:
> 
> Hi Jark and Alexander, 
> 
> Thanks for your comments! I’m also OK to introduce common table options. I 
> prefer to introduce a new DefaultLookupCacheOptions class for holding these 
> option definitions because putting all options into FactoryUtil would make it 
> a bit ”crowded” and not well categorized. 
> 
> FLIP has been updated according to suggestions above: 
> 1. Use static “of” method for constructing RescanRuntimeProvider considering 
> both arguments are required.
> 2. Introduce new table options matching DefaultLookupCacheFactory
> 
> Best,
> Qingsheng
> 
> On Wed, May 18, 2022 at 2:57 PM Jark Wu  wrote:
> Hi Alex,
> 
> 1) retry logic
> I think we can extract some common retry logic into utilities, e.g. 
> RetryUtils#tryTimes(times, call). 
> This seems independent of this FLIP and can be reused by DataStream users. 
> Maybe we can open an issue to discuss this and where to put it. 
> 
> 2) cache ConfigOptions
> I'm fine with defining cache config options in the framework. 
> A candidate place to put is FactoryUtil which also includes 
> "sink.parallelism", "format" options.
> 
> Best,
> Jark
> 
> 
> On Wed, 18 May 2022 at 13:52, Александр Смирнов  wrote:
> Hi Qingsheng,
> 
> Thank you for considering my comments.
> 
> >  there might be custom logic before making retry, such as re-establish the 
> > connection
> 
> Yes, I understand that. I meant that such logic can be placed in a
> separate function, that can be implemented by connectors. Just moving
> the retry logic would make connector's LookupFunction more concise +
> avoid duplicate code. However, it's a minor change. The decision is up
> to you.
> 
> > We decide not to provide common DDL options and let developers to define 
> > their own options as we do now per connector.
> 
> What is the reason for that? One of the main goals of this FLIP was to
> unify the configs, wasn't it? I understand that current cache design
> doesn't depend on ConfigOptions, like was before. But still we can put
> these options into the framework, so connectors can reuse them and
> avoid code duplication, and, what is more significant, avoid possible
> different options naming. This moment can be pointed out in
> documentation for connector developers.
> 
> Best regards,
> Alexander
> 
> вт, 17 мая 2022 г. в 17:11, Qingsheng Ren :
> >
> > Hi Alexander,
> >
> > Thanks for the review and glad to see we are on the same page! I think you 
> > forgot to cc the dev mailing list so I’m also quoting your reply under this 
> > email.
> >
> > >  We can add 'maxRetryTimes' option into this class
> >
> > In my opinion the retry logic should be implemented in lookup() instead of 
> > in LookupFunction#eval(). Retrying is only meaningful under some specific 
> > retriable failures, and there might be custom logic before making retry, 
> > such as re-establish the connection (JdbcRowDataLookupFunction is an 
> > example), so it's more handy to leave it to the connector.
> >
> > > I don't see DDL options, that were in previous version of FLIP. Do you 
> > > have any special plans for them?
> >
> > We decide not to provide common DDL options and let developers to define 
> > their own options as we do now per connector.
> >
> > The rest of comments sound great and I’ll update the FLIP. Hope we can 
> > finalize our proposal soon!
> >
> > Best,
> >
> > Qingsheng
> >
> >
> > > On May 17, 2022, at 13:46, Александр Смирнов  wrote:
> > >
> > > Hi Qingsheng and devs!
> > >
> > > I like the overall design of updated FLIP, however I have several
> > > suggestions and questions.
> > >
> > > 1) Introducing LookupFunction as a subclass of TableFunction is a good
> > > idea. We can add 'maxRetryTimes' option into this class. 'eval' method
> > > of new LookupFunction is great for this purpose. The same is for
> > > '

Re: [DISCUSS] FLIP-221 Abstraction for lookup source cache and metric

2022-05-22 Thread Qingsheng Ren
Hi Alexander,

Thanks for the review! We recently updated the FLIP and you can find those 
changes from my latest email. Since some terminologies has changed so I’ll use 
the new concept for replying your comments. 

1. Builder vs ‘of’
I’m OK to use builder pattern if we have additional optional parameters for 
full caching mode (“rescan” previously). The schedule-with-delay idea looks 
reasonable to me, but I think we need to redesign the builder API of full 
caching to make it more descriptive for developers. Would you mind sharing your 
ideas about the API? For accessing the FLIP workspace you can just provide your 
account ID and ping any PMC member including Jark. 

2. Common table options
We have some discussions these days and propose to introduce 8 common table 
options about caching. It has been updated on the FLIP. 

3. Retries
I think we are on the same page :-)

For your additional concerns:
1) The table option has been updated.
2) We got “lookup.cache” back for configuring whether to use partial or full 
caching mode.

Best regards, 

Qingsheng



> On May 19, 2022, at 17:25, Александр Смирнов  wrote:
> 
> Also I have a few additions:
> 1) maybe rename 'lookup.cache.maximum-size' to
> 'lookup.cache.max-rows'? I think it will be more clear that we talk
> not about bytes, but about the number of rows. Plus it fits more,
> considering my optimization with filters.
> 2) How will users enable rescanning? Are we going to separate caching
> and rescanning from the options point of view? Like initially we had
> one option 'lookup.cache' with values LRU / ALL. I think now we can
> make a boolean option 'lookup.rescan'. RescanInterval can be
> 'lookup.rescan.interval', etc.
> 
> Best regards,
> Alexander
> 
> чт, 19 мая 2022 г. в 14:50, Александр Смирнов :
>> 
>> Hi Qingsheng and Jark,
>> 
>> 1. Builders vs 'of'
>> I understand that builders are used when we have multiple parameters.
>> I suggested them because we could add parameters later. To prevent
>> Builder for ScanRuntimeProvider from looking redundant I can suggest
>> one more config now - "rescanStartTime".
>> It's a time in UTC (LocalTime class) when the first reload of cache
>> starts. This parameter can be thought of as 'initialDelay' (diff
>> between current time and rescanStartTime) in method
>> ScheduleExecutorService#scheduleWithFixedDelay [1] . It can be very
>> useful when the dimension table is updated by some other scheduled job
>> at a certain time. Or when the user simply wants a second scan (first
>> cache reload) be delayed. This option can be used even without
>> 'rescanInterval' - in this case 'rescanInterval' will be one day.
>> If you are fine with this option, I would be very glad if you would
>> give me access to edit FLIP page, so I could add it myself
>> 
>> 2. Common table options
>> I also think that FactoryUtil would be overloaded by all cache
>> options. But maybe unify all suggested options, not only for default
>> cache? I.e. class 'LookupOptions', that unifies default cache options,
>> rescan options, 'async', 'maxRetries'. WDYT?
>> 
>> 3. Retries
>> I'm fine with suggestion close to RetryUtils#tryTimes(times, call)
>> 
>> [1] 
>> https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ScheduledExecutorService.html#scheduleWithFixedDelay-java.lang.Runnable-long-long-java.util.concurrent.TimeUnit-
>> 
>> Best regards,
>> Alexander
>> 
>> ср, 18 мая 2022 г. в 16:04, Qingsheng Ren :
>>> 
>>> Hi Jark and Alexander,
>>> 
>>> Thanks for your comments! I’m also OK to introduce common table options. I 
>>> prefer to introduce a new DefaultLookupCacheOptions class for holding these 
>>> option definitions because putting all options into FactoryUtil would make 
>>> it a bit ”crowded” and not well categorized.
>>> 
>>> FLIP has been updated according to suggestions above:
>>> 1. Use static “of” method for constructing RescanRuntimeProvider 
>>> considering both arguments are required.
>>> 2. Introduce new table options matching DefaultLookupCacheFactory
>>> 
>>> Best,
>>> Qingsheng
>>> 
>>> On Wed, May 18, 2022 at 2:57 PM Jark Wu  wrote:
 
 Hi Alex,
 
 1) retry logic
 I think we can extract some common retry logic into utilities, e.g. 
 RetryUtils#tryTimes(times, call).
 This seems independent of this FLIP and can be reused by DataStream users.
 Maybe we can open an issue to discuss this and where to put it.
 
 2) cache ConfigOptions
 I'm fine with defining cache config options in the framework.
 A candidate place to put is FactoryUtil which also includes 
 "sink.parallelism", "format" options.
 
 Best,
 Jark
 
 
 On Wed, 18 May 2022 at 13:52, Александр Смирнов  
 wrote:
> 
> Hi Qingsheng,
> 
> Thank you for considering my comments.
> 
>> there might be custom logic before making retry, such as re-establish 
>> the connection
> 
> Yes, I understand that. I meant that such logic can be placed in a
> s