Re: [DISCUSS] Creating an external connector repository

2021-10-19 Thread Jark Wu
Hi, I think Thomas raised very good questions and would like to know your opinions if we want to move connectors out of flink in this version. (1) is the connector API already stable? > Separate releases would only make sense if the core Flink surface is > fairly stable though. As evident from Ic

Re: [External] metric collision using datadog and standalone Kubernetes HA mode

2021-10-19 Thread Clemens Valiente
Hi Chesnay, thanks a lot for the clarification. We managed to resolve the collision, and isolated a problem to the metrics themselves. Using the REST API at /jobs//metrics?get=uptime the response is [{"id":"uptime","value":"-1"}] despite the job running and processing data for 5 days at that point

Re: Flink ignoring latest checkpoint on restart?

2021-10-19 Thread LeVeck, Matt
Thanks David. We're running Zookeeper 3.4.10. I don't see anything immediately standing out in the zookeeper logs. I think what would help perhaps as much as help diagnosing is the following. When it does fail to find the latest checkpoint in Zookeeper, Flink seems to go to the checkpoint in th

Re: Flink 1.14 doesn’t support Kafka consumer 0.11 or lower?

2021-10-19 Thread Qingsheng Ren
Hi Jary, Flink removed the Kafka 0.10 & 0.11 connectors in 1.12, because Kafka has supported bidirectional compatibility since version 0.10, which means you can use a newer client version to communicate with an older broker (e.g. Kafka client 2.4.1 & Kafka broker 0.11) [1]. You can try to swit

Flink 1.14 doesn’t support Kafka consumer 0.11 or lower?

2021-10-19 Thread Jary Zhen
Hi everyone, I'm using Flink 1.14 to consume data from Kafka version 0.11, and I get some errors while running. Caused by: java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.poll(Ljava/time/Duration;)Lorg/apache/kafka/clients/consumer/ConsumerRecords; at org.apa

Re: Impossible to get pending file names/paths on checkpoint?

2021-10-19 Thread Preston Price
So Fabian, I wanted to follow up on something, perhaps you can weigh in. I had previously made the claim that getting things working with ADLS would be trivial, but that has turned out not to be the case in Flink 1.14. I have a sink that works in Flink 1.10 based off the old BucketingSink

Re: Troubleshooting checkpoint timeout

2021-10-19 Thread Caizhi Weng
Hi! I see you're using sliding event time windows. What's the exact value of windowLengthMinutes and windowSlideTimeMinutes? If windowLengthMinutes is large and windowSlideTimeMinutes is small then each record may be assigned to a large number of windows as the pipeline proceeds, thus gradually sl
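The fan-out described above can be estimated with a little arithmetic. The following is a minimal sketch, not code from the thread: the class name, method name, and the 60-minute/1-minute values are hypothetical, and it assumes the usual sliding-window rule that a record falls into at most ceil(size / slide) windows.

```java
import java.time.Duration;

// Hypothetical helper: estimates how many sliding windows one record can belong to.
class SlidingWindowFanout {

    // Upper bound on the number of sliding windows containing a single record:
    // ceil(size / slide). Both durations must be strictly positive.
    static long maxWindowsPerRecord(Duration size, Duration slide) {
        if (size.isZero() || size.isNegative() || slide.isZero() || slide.isNegative()) {
            throw new IllegalArgumentException("size and slide must be positive");
        }
        long sizeMs = size.toMillis();
        long slideMs = slide.toMillis();
        return (sizeMs + slideMs - 1) / slideMs; // integer ceiling division
    }

    public static void main(String[] args) {
        // Assumed example values, not the poster's actual configuration.
        Duration windowLength = Duration.ofMinutes(60);
        Duration windowSlide = Duration.ofMinutes(1);
        System.out.println("windows per record: "
                + maxWindowsPerRecord(windowLength, windowSlide));
    }
}
```

With a long window and a short slide, every incoming record is copied into many windows, so state size and per-record work grow by roughly that factor.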

Unable to create connection to Azure Data Lake Gen2 with abfs: "Configuration property {storage_account}.dfs.core.windows.net not found"

2021-10-19 Thread Preston Price
Some details about my runtime/environment: Java 11 Flink version 1.14.0 Running locally in IntelliJ The error message that I am getting is: Configuration property {storage_account}.dfs.core.windows.net not found. Reading through all the docs hasn't yielded much help. In the Flink docs here

Write savepoint from test harness

2021-10-19 Thread Mike Barborak
Hi, I am using the KeyedOneInputStreamOperatorTestHarness. With that, I can take a snapshot and then use OperatorSnapshotUtil to write and read it. I am wondering if I can take a savepoint using the test harness or write the snapshot as a savepoint in order to use the ExistingSavepoint API to e

Metric Scopes purpose

2021-10-19 Thread JP MB
Hello, I have been playing with metric scopes and I'm not sure if I understood them correctly. https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/metrics/ For instance, - metrics.scope.task - Default: <host>.taskmanager.<tm_id>.<job_name>.<task_name>.<subtask_index> - Applied to all metrics that were scoped to

Re: Programmatically configuring S3 settings

2021-10-19 Thread Pavel Penkov
I've placed a flink-conf.yaml file in conf dir but StreamExecutionEnvironment.getExecutionEnvironment doesn't pick it up. If set programmatically keys are visible in Flink Web UI, they are just not passed to Hadoop FS. On 2021/10/18 03:04:04, Yangze Guo wrote: > Hi, Pavel.> > > From my understand

Re: [DISCUSS] Creating an external connector repository

2021-10-19 Thread Chesnay Schepler
Could you clarify what release cadence you're thinking of? There's quite a big range that fits "more frequent than Flink" (per-commit, daily, weekly, bi-weekly, monthly, even bi-monthly). On 19/10/2021 14:15, Martijn Visser wrote: Hi all, I think it would be a huge benefit if we can achieve m

Re: Flink 1.14.0 reactive mode cannot rescale

2021-10-19 Thread David Morávek
Fixed in the context of FLINK-22815 just means that the feature set described in this issue has been delivered. In this case it means that unaligned checkpoints have been disabled. D. On Tue, Oct 19, 2021 at 2:22 PM ChangZhuo Chen (陳昌倬) wrote: > On Tue, Oct 19, 2021 at 11:51:44AM +0200, David Morá

Re: [DISCUSS] Creating an external connector repository

2021-10-19 Thread Chesnay Schepler
TBH I think you're overestimating how much work it is to create a non-Flink release. Having done most of the flink-shaded releases, I really don't see an issue with even doing weekly releases with that process. We cannot reduce the number of votes AFAIK; the ASF seems very clear on that matter

Re: [DISCUSS] Creating an external connector repository

2021-10-19 Thread Dawid Wysakowicz
Hey all, I don't have much to add to the general discussion. Just a single comment on: that we could adjust the bylaws for the connectors such that we need fewer PMCs to approve a release. Would it be enough to have one PMC vote per connector release? I think it's not an option. This

Re: [DISCUSS] Creating an external connector repository

2021-10-19 Thread Konstantin Knauf
Thank you, Arvid & team, for working on this. I would also favor one connector repository under the ASF. This will already force us to provide better tools and more stable APIs, which connectors developed outside of Apache Flink will benefit from, too. Besides simplifying the formal release proce

Re: Flink 1.14.0 reactive mode cannot rescale

2021-10-19 Thread 陳昌倬
On Tue, Oct 19, 2021 at 11:51:44AM +0200, David Morávek wrote: > Hi ChangZhuo, > > this seems to be a current limitation of the unaligned checkpoints [1], are > you using any broadcasted streams in your application? > > [1] https://issues.apache.org/jira/browse/FLINK-22815 * Yes, we do have broa

Re: [DISCUSS] Creating an external connector repository

2021-10-19 Thread Arvid Heise
Okay I think it is clear that the majority would like to keep connectors under the Apache Flink umbrella. That means we will not be able to have per-connector repositories and project management, automatic dependency bumping with Dependabot, or semi-automatic releases. So then I'm assuming the dir

Re: [DISCUSS] Creating an external connector repository

2021-10-19 Thread Martijn Visser
Hi all, I think it would be a huge benefit if we can achieve more frequent releases of connectors, which are not bound to the release cycle of Flink itself. I agree that in order to get there, we need to have stable interfaces which are trustworthy and reliable, so they can be safely used by those

Re: [External] : Timeout settings for Flink jobs?

2021-10-19 Thread Arvid Heise
Yes, external orchestration sounds like the best idea. Alternatively, you can try to reach the job manager from a sink subtask and use the REST API to trigger such a stop-with-savepoint. [1] The JobManager should be accessible anyway from your task managers. [1] https://ci.apache.org/projects/flink/flink-d
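As a rough illustration of the REST approach mentioned above, the sketch below only builds the request with the JDK's java.net.http API (Java 11+) and does not send it. The class name, host, job ID, and savepoint directory are hypothetical placeholders; the endpoint `/jobs/<jobid>/stop` and the `targetDirectory`/`drain` fields are given as I understand the Flink REST API and should be checked against the documentation for the Flink version in use.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds (but does not send) a stop-with-savepoint request.
class StopWithSavepointRequest {

    // Assumes the JobManager REST endpoint POST /jobs/<jobid>/stop with a JSON body;
    // verify the path and field names against the REST API docs of your Flink version.
    static HttpRequest build(String jobManagerUrl, String jobId, String savepointDir) {
        String body = "{\"targetDirectory\":\"" + savepointDir + "\",\"drain\":false}";
        return HttpRequest.newBuilder()
                .uri(URI.create(jobManagerUrl + "/jobs/" + jobId + "/stop"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // All three arguments are placeholders, not values from the thread.
        HttpRequest request = build(
                "http://jobmanager:8081",
                "00000000000000000000000000000000",
                "file:///tmp/savepoints");
        System.out.println(request.method() + " " + request.uri());
        // To actually trigger the stop, send the request with
        // java.net.http.HttpClient.newHttpClient().send(request, BodyHandlers.ofString()).
    }
}
```

Triggering this from inside a sink subtask couples the job to its own control plane, which is why external orchestration is suggested first.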

Troubleshooting checkpoint timeout

2021-10-19 Thread Alexis Sarda-Espinosa
Hello everyone, I am doing performance tests for one of our streaming applications and, after increasing the throughput a bit (~500 events per minute), it has started failing because checkpoints cannot be completed within 10 minutes. The Flink cluster is not exactly under my control and is runn

Re: Flink ignoring latest checkpoint on restart?

2021-10-19 Thread David Morávek
Hi Matt, this seems interesting. I'm aware of some possible inconsistency issues with unstable connections [1], but I have yet to find out if this could be related. I'll do some research on this and will get back to you. In the meantime, can you see anything relevant in the zookeeper logs? Also w

Re: Flink 1.14.0 reactive mode cannot rescale

2021-10-19 Thread David Morávek
Hi ChangZhuo, this seems to be a current limitation of the unaligned checkpoints [1], are you using any broadcasted streams in your application? [1] https://issues.apache.org/jira/browse/FLINK-22815 Best, D. On Tue, Oct 19, 2021 at 3:58 AM ChangZhuo Chen (陳昌倬) wrote: > Hi, > > We found that F

Flink ignoring latest checkpoint on restart?

2021-10-19 Thread LeVeck, Matt
My team and I could use some help debugging the following issue, and maybe understanding Flink's full checkpoint recovery decision tree: We've seen a few times a scenario where a task restarts (but not the job manager), a recent checkpoint is saved. But upon coming back up Flink chooses a much o

Re: Let PubSubSource support changing subscriptions?

2021-10-19 Thread Shiao-An Yuan
Hi Arvid, Thank you for the suggestion. I have created a ticket: https://issues.apache.org/jira/browse/FLINK-24587 Thanks, sayuan On Mon, Oct 18, 2021 at 4:45 PM Arvid Heise wrote: > Hi Sayuan, > > I'm not familiar with PubSub and can't assess if that's a valid request or > not. Maybe Niels ca

Re: [External] : Re: How to enable customize logging library based on SLF4J for Flink deployment in Kubernetes

2021-10-19 Thread Chesnay Schepler
1) Adding it as a dependency to the Flink application does not work with an actual Flink cluster, because said dependency must be available when the cluster is /started/. It works in the IDE because there everything is put onto the same classpath. 2) folder structure shouldn't be relevant. So

Re: Re: How to deserialize Avro enum type in Flink SQL?

2021-10-19 Thread Peter Schrott
Hi & thanks, with your solution you are referring to the reported exception: `Found my.type.avro.MyEnumType, expecting union` I investigated the "union" part and added "NOT NULL" to the SQL statement, such that the attribute is NOT nullable in Avro AND SQL. This actually "fixed" the report

Re: Avro UTF-8 deserialization on integration DataStreams API -> Table API (with Confluent SchemaRegistry)

2021-10-19 Thread Peter Schrott
Update: I assume you are talking about DataStreamSource.process(.), right? ( https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/streaming/api/datastream/DataStream.html#process-org.apache.flink.streaming.api.functions.ProcessFunction- ) So similar to a .map(.) fun

Re: Avro UTF-8 deserialization on integration DataStreams API -> Table API (with Confluent SchemaRegistry)

2021-10-19 Thread Caizhi Weng
Hi! Sorry for the confusion. I mean DataStream#process, see https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/streaming/api/datastream/DataStream.html#process-org.apache.flink.streaming.api.functions.ProcessFunction- On Tue, Oct 19, 2021 at 3:10 PM Peter Schrott wrote: > Hi &

Re: Avro UTF-8 deserialization on integration DataStreams API -> Table API (with Confluent SchemaRegistry)

2021-10-19 Thread Peter Schrott
Hi & thanks! DataStreamSource does not provide a method processRecord: https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/streaming/api/datastream/DataStreamSource.html Can you point me to the docs for that? Thanks, Peter On Tue, Oct 19, 2021 at 4:47 AM Caizhi