Unable to start job using Flink Operator

2022-07-28 Thread Geldenhuys, Morgan Karl
Greetings all, I am attempting to start a Flink job using the Flink operator (version 1.1), however I am running into a problem. While attempting to create the deployment I receive the following error: Resource: "flink.apache.org/v1beta1, Resource=flinkdeployments", GroupVersionKind: "flink.apa

Advice needed: Flink Kubernetes Operator with Prometheus Configuration

2022-06-23 Thread Geldenhuys, Morgan Karl
Greetings all, I am trying to deploy Flink jobs using the Flink Kubernetes Operator and I would like to have Prometheus scrape metrics from the various pods. The jobs are created successfully, however, the metrics don't appear to be available. The following steps were followed based on the

Flink Operator 1.0.0 not working

2022-06-08 Thread Geldenhuys, Morgan Karl
Greetings all, I am trying to get the Flink operator (https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.0/) working, however I am running into a number of issues. I have a fresh Kubernetes cluster running and have followed all the instructions for deploying the operator

Resizing kube container sizes dynamically for custom jobs

2022-05-11 Thread Geldenhuys, Morgan Karl
Greetings all, I have a question concerning resource allocation for Apache Flink. I have a Flink native session cluster running and I'm interested in rolling out multiple jobs. However, I would like to size the container resources (CPU and memory) differently for each job. Is this possible? i.e

Re: How to reduce interval between Uptime Metric measurements?

2021-12-31 Thread Geldenhuys, Morgan Karl
Thanks for the hint, however I am not using the Prometheus push gateway. Regards, M. From: Caizhi Weng Sent: 28 December 2021 02:17:34 To: Geldenhuys, Morgan Karl Cc: user@flink.apache.org Subject: Re: How to reduce interval between Uptime Metric measurements

How to reduce interval between Uptime Metric measurements?

2021-12-27 Thread Geldenhuys, Morgan Karl
Hello everyone, I have a Flink 1.14 job running and I'm looking at the uptime metric (flink_jobmanager_job_uptime) together with Prometheus (scraping every second). It looks as if this metric is updated every 60 seconds. Is there a way of decreasing this interval? A fixed delay recovery strategy

Rescaling feature disabled for Flink 1.14

2021-12-15 Thread Geldenhuys, Morgan Karl
Greetings all, I am trying to test the rescaling feature of Flink 1.14, however when I send a REST request to the endpoint I receive the following message: "org.apache.flink.runtime.rest.handler.RestHandlerException: Rescaling is temporarily disabled. See FLINK-12312. \tat org.apache.flink.r

Information request: Reactive mode and Rescaling

2021-12-15 Thread Geldenhuys, Morgan Karl
Greetings, I would like to find out more about Flink's new reactive mode as well as the rescaling feature regarding fault tolerance. For the following question, let's assume checkpointing is enabled using HDFS. So first question, if I have a job where the source(s) and sink(s) are configured t

Re: Latency monitoring in Flink 1.14.0

2021-12-13 Thread Geldenhuys, Morgan Karl
see this: https://stackguides.com/questions/68917956/read-flink-latency-tracking-metric-in-datadog Also `metrics.latency.granularity` must be set in the Flink configuration. Not sure if `-D` forwards this properly. Timo On 10.12.21 18:31, Geldenhuys, Morgan Karl wrote: > Greetings

Latency monitoring in Flink 1.14.0

2021-12-10 Thread Geldenhuys, Morgan Karl
Greetings all, I am attempting to set up latency monitoring for a Flink 1.14.0 job. According to the documentation, I have done the following: In my Kubernetes setup I have added the following to the kubernetes-sess

How to unsubscribe?

2021-06-08 Thread Geldenhuys, Morgan Karl
How can I unsubscribe from this mailing list? The volume is just getting too much at the moment. Following the steps described on the website (https://flink.apache.org/community.html) did not appear to do anything. Sorry for the spam and thanks in advance.

Re: Checkpoint timeouts at times of high load

2021-04-05 Thread Geldenhuys, Morgan Karl
e some detailed information of checkpoint? For example, the detailed checkpoint information from the web.[1] And which Flink version do you use? [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/monitoring/checkpoint_monitoring.html Best, Guowei On Thu, Apr 1, 2021 at 4:33 P

Checkpoint timeouts at times of high load

2021-04-01 Thread Geldenhuys, Morgan Karl
Hi Community, I have a number of Flink jobs running inside my session cluster with varying checkpoint intervals plus a large amount of operator state, and in times of high load the jobs fail due to checkpoint timeouts (set to 6 minutes). I can only assume this is because the latencies for savi