Re: Tracking Total Metrics Reported

2021-09-15 Thread Yangze Guo
Hi Mason, AFAIK the JM does not report the total number of metrics it has. Maybe you can count them for each entity through [1]? [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/metrics/#rest-api-integration Best, Yangze Guo On Thu, Sep 16, 2021 at 9:30 AM Mason Chen wrote: >
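
A minimal sketch of the per-entity counting suggested above, assuming that a metrics listing from the REST API (for example the JobManager's metrics endpoint) is a JSON array of objects that each carry an "id" field. The helper name count_metric_ids and the sample payload are hypothetical and only for illustration. A listing like this does not say whether a metric is a counter, gauge, meter, or histogram.

```python
import json


def count_metric_ids(payload: str) -> int:
    """Count the distinct metric IDs in one metrics listing (a JSON array)."""
    entries = json.loads(payload)
    # Assumption: every entry is an object with an "id" key.
    return len({entry["id"] for entry in entries})


# Hypothetical sample payload; a real one would be fetched from the REST API.
sample = (
    '[{"id": "Status.JVM.CPU.Load"},'
    ' {"id": "numRegisteredTaskManagers"},'
    ' {"id": "taskSlotsTotal"}]'
)
print(count_metric_ids(sample))
```

Summing this count over the listings of each entity (JobManager, TaskManagers, jobs) would approximate a total, under the assumptions stated above.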

Re: FlinkSQL Sinks

2021-09-15 Thread JING ZHANG
Hi, I agree with Martijn. Besides, there is an example in the SQL client documentation [1] which contains an external sink instead of a print sink. > Flink SQL> CREATE TABLE pageviews (> user_id BIGINT,> page_id BIGINT,> > viewtime TIMESTAMP,> proctime AS PROCTIME()> ) WITH (> 'connector' = >

Re: Tracking Total Metrics Reported

2021-09-15 Thread Mason Chen
For it to be most useful, the user should be able to obtain the total number of counters, gauges, meters, and histograms, separately. On Wed, Sep 15, 2021 at 6:23 PM Mason Chen wrote: > Hi all, > > Does Flink have any sort of feature to track the total number of metrics > reported by the Flink j

Tracking Total Metrics Reported

2021-09-15 Thread Mason Chen
Hi all, Does Flink have any sort of feature to track the total number of metrics reported by the Flink job? Ideally, the total would be reported by the job manager. Even if there is a log that exposes this information, that would be helpful! Best, Mason

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread David Morávek
If we are shutting down any sources of unbounded jobs that run on Flink versions without FLIP-147 (available in 1.14) [1], that Matthias has mentioned, then it's IMO a bug, because it effectively breaks checkpointing. Fabian, can you please verify whether this is intended behavior? In the meant

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread Lars Skjærven
Got it. So the workaround for now (1.13.2) is to fall back to FlinkKafkaConsumer if I read you correctly. Thanks L On Wed, Sep 15, 2021 at 2:58 PM Matthias Pohl wrote: > Hi Lars, > I guess you are looking > for execution.checkpointing.checkpoints-after-tasks-finish.enabled [1]. > This configurat

Flink S3A failed to connect to service endpoint from IntelliJ IDE

2021-09-15 Thread James Kim
I'm trying to write Java code in IntelliJ IDE to make use of the Table API and the data I'm using is going to be from a CSV file over s3a. The IDE project is in Maven and has a pom.xml like the following: <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-in

Re: CEP library support in Python

2021-09-15 Thread Pedro Silva
Understood, I was looking for a way to define these metrics that is attainable for non-programmers to develop. Thank you for the answer Seth Pedro > On 15 Sep 2021, at 18:38, Seth Wiesman wrote: > >  > Honestly, I don't think you need CEP or MATCH_RECOGNIZE for that use case. It > can be s

Re: Beam with Flink runner - Issues when writing to S3 in Parquet Format

2021-09-15 Thread Kathula, Sandeep
Hi Jan, Thanks for the reply. To answer your questions: 1. We are using RocksDB as backend. 2. We are using 10 minutes checkpointing interval. 3. We are getting 5,000 records per second at max each with size of around 5KB from Kafka (25 MB/sec) which we are trying to write to

Re: CEP library support in Python

2021-09-15 Thread Seth Wiesman
Honestly, I don't think you need CEP or MATCH_RECOGNIZE for that use case. It can be solved with a simple process function that tracks the state for each id. Output a 1 when a job completes and a -1 if canceled. Output the sum. You can use a simple timer to clear the state for a job after 6 months
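
One possible reading of this suggestion, sketched in plain Python rather than with the Flink API: a dictionary stands in for the per-id keyed state, and the running sum is emitted after every event. The event shape (job_id, status), the status names "completed" and "canceled", and the function name running_total are assumptions for illustration. The timer-based state cleanup mentioned above is left out.

```python
from typing import Dict, Iterable, Iterator, Tuple


def running_total(events: Iterable[Tuple[str, str]]) -> Iterator[int]:
    """Yield the running sum after each (job_id, status) event."""
    counted: Dict[str, bool] = {}  # per-id state: is this job currently counted?
    total = 0
    for job_id, status in events:
        if status == "completed" and not counted.get(job_id, False):
            counted[job_id] = True
            total += 1  # corresponds to outputting a 1
        elif status == "canceled" and counted.get(job_id, False):
            counted[job_id] = False
            total -= 1  # corresponds to outputting a -1
        yield total


# Hypothetical event stream.
events = [("a", "completed"), ("b", "completed"), ("a", "canceled")]
print(list(running_total(events)))
```

In an actual job, the dictionary would be replaced by keyed state, and a timer would clear the entry for an id once it is no longer needed.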

Re: CEP library support in Python

2021-09-15 Thread Pedro Silva
Hello, Has anyone used streaming SQL pattern matching as shown in this email thread to count certain transitions on a stream? Is it feasible? Thank you, Pedro Silva > On 13 Sep 2021, at 11:16, Pedro Silva wrote: > > > Hello Seth, > > Thank you very much for your reply. I've taken a look at

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-15 Thread Seth Wiesman
I just want to chime in that if you really do need to drop a partition, Flink already supports a solution. If you manually stop the job with a savepoint and restart it with a new UID on the source operator, along with passing the --allowNonRestoredState flag to the client, the source will disregar

Re: FlinkSQL Sinks

2021-09-15 Thread Martijn Visser
Hi, You can do this directly via the SQL client by defining (for example Kafka) as a TABLE [1] and using an INSERT INTO [2]. Best regards, Martijn [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/kafka/ [2] https://ci.apache.org/projects/flink/flink-docs-re

FlinkSQL Sinks

2021-09-15 Thread Pedro Silva
Hello, Is it possible to configure a sink for SQL client queries other than the terminal/stdout? Looking at the SQL Client Configuration, it seems that the output of the client is always to

Re: [ANNOUNCE] Flink mailing lists archive service has migrated to Apache Archive service

2021-09-15 Thread Matthias Pohl
Thanks Leonard for the announcement. I guess that is helpful. @Robert is there any way we can change the default setting to something else (e.g. greater than 0 days)? Only having the last month available as a default is kind of annoying considering that the time setting is quite hidden. Matthias

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread Matthias Pohl
Hi Lars, I guess you are looking for execution.checkpointing.checkpoints-after-tasks-finish.enabled [1]. This configuration parameter is going to be introduced in the upcoming Flink 1.14 release. Best, Matthias [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#execu

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-15 Thread David Morávek
I'll try to be more direct with the answer as you already have the context on what the issue is. When this happens we basically have these options: 1) We can throw an exception (with good wording, so the user knows what's going on) and fail the job. This forces the user to take immediate action and fi

IAM Roles with Service Account on Flink 1.12 Running on Kubernetes - Seeing Errors

2021-09-15 Thread Rayan Ahmed
Hi, I am trying to use IAM Roles with Service Accounts on Flink 1.12 running on Kubernetes. Previously I was using KIAM to provide identification to the pods and that works fine. However, when switching to use IRSA, I see the following errors (posted below). Has anyone experienced a similar is

KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread Lars Skjærven
Using KafkaSource builder with a job parallelism larger than the number of Kafka partitions, the job is unable to checkpoint. With a job parallelism of 4, 3 of the tasks are marked as FINISHED for the Kafka topic with one partition. For this reason checkpointing seems to be disabled. When using F

Re: JVM Metaspace capacity planning

2021-09-15 Thread Puneet Duggal
Thank you guys, this documentation lists exactly the issues that I am facing. > On 14-Sep-2021, at 2:14 PM, Guowei Ma wrote: > > Hi, Puneet > In general every job has its own classloader. You could find more detailed > information from doc [1]. > You could put some common jar into the "/

Re: Streaming SQL support for redis streaming connector

2021-09-15 Thread Leonard Xu
Hi Osada, I just want to offer some material here. The flink-cdc-connectors project [1] may also help you; we recently added support for the document database MongoDB [2]. Best, Leonard [1] https://github.com/ververica/flink-cdc-connectors [2] https://verve