Flink k8s operator

2024-12-11 Thread Xiang Wang via user
Hello Flink group, I am currently evaluating Flink K8s operator to manage our Flink applications, but there are some concerns raised by our K8s cluster admins which I hope to get answers from this group: 1. What is the webhook used for? The documentation of webhook is very limited. Am I risking lo

Unsubscribe

2024-11-28 Thread Luyi Wang
Unsubscribe

Re: Cdc on HA postgresql

2024-11-25 Thread Hongshun Wang
Hi, Steffen Currently, PostgreSQL database not support this feather. Although slave PostgreSQL can also provide logical replication since PG 16+, the replication slot is not existed when checkout database. If you want to use postgres cdc with HA setup, you have to create slots at all the instanc

Re: Handling data skewness

2024-08-19 Thread Lei Wang
You can just use Math.abs(key.hashCode()) % numPartitions Regards, Lei On Mon, Aug 19, 2024 at 5:41 PM Karthick wrote: > Thanks Lei, will check it out. Please suggest me a Algorithm which > solves the problem if any. > > On Mon, Aug 19, 2024 at 2:17 PM Lei Wang wrote: >

Re: Handling data skewness

2024-08-19 Thread Lei Wang
Hi Karthick, Take a look at the distribution of your keys to see if there's some keys that contribute most of the data. If the distrubution is relatively uniform,try to use partitionCustomer with a self-defined partion function instead of keyBy. The default partition function in flink implements a

Easily deployed shared file system that flink can use

2024-08-12 Thread Lei Wang
We deploy flink with standalone mode. We don't use hadoop so there's no hdfs. Also it is locally deployed and there's no cloud storage such as GFS、 s3、 oss can be accessed. Is there any other easily deployed file system that flink checkpoint/savepoint can use and share between task managers? Thank

Re: KafkaSink self-defined RoundRobinPartitioner not able to discover new partitions

2024-07-31 Thread Lei Wang
27;s a bug in RoundRobinPartitioner and the distribution is still uneven https://issues.apache.org/jira/browse/KAFKA-9965 On Tue, Jul 30, 2024 at 4:20 PM Lei Wang wrote: > I wrote a FlinkRoundRobinPartitioner extends FlinkKafkaPartitioner and use > it as following: > &g

Re: [Request Help] Flink StreamRecord granularity latency metrics

2024-07-31 Thread Lei Wang
Hi Yubin, We implement it in this manner. For every record, we define several time fields. When the record first enters the system, set one field to current time. After several complex calculation operator, set another field to currentTime. Just calculate the difference between the two values. Hop

KafkaSink self-defined RoundRobinPartitioner not able to discover new partitions

2024-07-30 Thread Lei Wang
I wrote a FlinkRoundRobinPartitioner extends FlinkKafkaPartitioner and use it as following: KafkaSink kafkaSink = KafkaSink.builder() .setBootstrapServers(sinkServers).setKafkaProducerConfig(props) .setDeliverGuarantee(DeliveryGuarantee.AT_LEAST_ONCE) .setRecordSerializer(KafkaRecordSerializationS

What the default partition assignment strategy for KafkaSourceBuilder

2024-07-02 Thread Lei Wang
A simple flink task that consumes a kafka topic message and does some calculation. The number of partitions of the topic is 48, I set the parallel also 48 and expect one parallel consumes one partition. But after submitting the task I found that there's 5 parallels consuming two partitions and 5 pa

Re:Re: Checkpoints and windows size

2024-06-19 Thread Feifan Wang
se. Maybe someone with more experience can give more valuable advice. > How can I find idle check point size of my project, I found below link but it > is not talking about parallelism. What do you mean "idle checkpoint size" ? —— Best regards, Feifan Wang 在

Re:Checkpoints and windows size

2024-06-19 Thread Feifan Wang
overhead, so this is a trade-off. My personal suggestion is to set the checkpoint interval to 5 minutes when using rocksdb incremental checkpoint. You can also make your own choice based on the impacts mentioned above. —— Best regards, Feifan Wang At 2024-06-19 12:08:57, "

Re: Problems with multiple sinks using postgres-cdc connector

2024-06-17 Thread Hongshun Wang
-> process1 -> process2 -> sink2 > `--> sink1 > > I get the errors described, where it appears that a second process is > created that attempts to use the current slot twice. > > On Mon, Jun 17, 2024 at 1:58 AM Hongshun Wang > wrote: > >> Hi David, >> >

Re: Problems with multiple sinks using postgres-cdc connector

2024-06-17 Thread Hongshun Wang
Hi David, > When I add this second sink, the postgres-cdc connector appears to add a second reader from the replication log, but with the same slot name. I don't understand what you mean by adding a second sink. Do they share the same source, or does each have a separate pipeline? If the former on

Re: Force to commit kafka offset when stop a job.

2024-06-11 Thread Lei Wang
; Zhanghao Chen > ------ > *From:* Lei Wang > *Sent:* Thursday, June 6, 2024 16:54 > *To:* Zhanghao Chen ; ruanhang1...@gmail.com < > ruanhang1...@gmail.com> > *Cc:* user > *Subject:* Re: Force to commit kafka offset when stop a job. > > Thanks Zhanghao && Hang. >

Re: Force to commit kafka offset when stop a job.

2024-06-06 Thread Lei Wang
avepoint [1]. Flink which will > trigger a final offset commit on the final savepoint. > > [1] > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/cli/#stopping-a-job-gracefully-creating-a-final-savepoint > > Best, > Zhanghao Chen > ---

Force to commit kafka offset when stop a job.

2024-06-05 Thread Lei Wang
When stopping a flink job that consuming kafka message, how to force it to commit kafka offset Thanks, Lei

Re: problem with the heartbeat interval feature

2024-05-18 Thread Hongshun Wang
/flink-cdc-base/src/main/java/org/apache/flink/cdc/connectors/base/source/reader/IncrementalSourceReaderWithCommit.java#L93 On Sat, May 18, 2024 at 12:34 AM Thomas Peyric wrote: > thanks Hongshun for your response ! > > Le ven. 17 mai 2024 à 07:51, Hongshun Wang a > écrit : > >>

Re: problem with the heartbeat interval feature

2024-05-16 Thread Hongshun Wang
Hi Thomas, In debezium dos says: For the connector to detect and process events from a heartbeat table, you must add the table to the PostgreSQL publication specified by the publication.name

Re: Why RocksDB metrics cache-usage is larger than cache-capacity

2024-04-23 Thread Lei Wang
u share the related rocksdb log which > may contain more detailed info ? > > On Fri, Apr 12, 2024 at 12:49 PM Lei Wang wrote: > >> >> I enable RocksDB native metrics and do some performance tuning. >> >> state.backend.rocksdb.block.cache-size is set to 128m,4 sl

Why RocksDB metrics cache-usage is larger than cache-capacity

2024-04-11 Thread Lei Wang
I enable RocksDB native metrics and do some performance tuning. state.backend.rocksdb.block.cache-size is set to 128m,4 slots for each TaskManager. The observed result for one specific parallel slot: state.backend.rocksdb.metrics.block-cache-capacity is about 14.5M state.backend.rocksdb.metric

Re: How to enable RocksDB native metrics?

2024-04-11 Thread Lei Wang
Thanks very much, it finally works On Thu, Apr 11, 2024 at 8:27 PM Zhanghao Chen wrote: > Add a space between -yD and the param should do the trick. > > Best, > Zhanghao Chen > -- > *From:* Lei Wang > *Sent:* Thursday, April 11, 2024 19:40

Re: How to enable RocksDB native metrics?

2024-04-11 Thread Lei Wang
old-styled CLI for YARN jobs where "-yD" instead of "-D" > should be used. > -- > *From:* Lei Wang > *Sent:* Thursday, April 11, 2024 12:39 > *To:* Biao Geng > *Cc:* user > *Subject:* Re: How to enable RocksDB native metrics?

Re: Optimize exact deduplication for tens of billions data per day

2024-04-10 Thread Lei Wang
Hi Peter, I tried,this improved performance significantly,but i don't know exactly why. According to what i know, the number of keys in RocksDB doesn't decrease. Any specific technical material about this? Thanks, Lei On Fri, Mar 29, 2024 at 9:49 PM Lei Wang wrote: > Perhaps

Re: How to enable RocksDB native metrics?

2024-04-10 Thread Lei Wang
ig/#rocksdb-native-metrics> >> <https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/config/#rocksdb-native-metrics> >> >> Sent from my iPhone >> >> On Apr 7, 2024, at 6:03 PM, Lei Wang wrote: >> >>  >> I want to en

Re: How to enable RocksDB native metrics?

2024-04-10 Thread Lei Wang
cs> >> [image: favicon.png] >> <https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/config/#rocksdb-native-metrics> >> <https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/config/#rocksdb-native-metrics> >> >>

Found an issue when using Flink 1.19's AsyncScalarFunction

2024-04-09 Thread Xiaolong Wang
Hi, I found a ClassNotFound exception when using Flink 1.19's AsyncScalarFunction. Stack trace: Caused by: java.lang.ClassNotFoundException: > org.apache.commons.text.StringSubstitutor > > at java.net.URLClassLoader.findClass(Unknown Source) ~[?:?] > > at java.lang.ClassLoader.loadClass(Unknown

Re: How to enable RocksDB native metrics?

2024-04-07 Thread Lei Wang
he.org/flink/flink-docs-release-1.19/docs/deployment/config/#rocksdb-native-metrics> > > Sent from my iPhone > > On Apr 7, 2024, at 6:03 PM, Lei Wang wrote: > >  > I want to enable it only for specified jobs, how can I specify the > configurations on cmd line when sub

Re: How to enable RocksDB native metrics?

2024-04-07 Thread Lei Wang
; https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#rocksdb-native-metrics >> (RocksDB Native Metrics) >> >> >> Best, >> Zakelly >> >> On Sun, Apr 7, 2024 at 4:47 PM Lei Wang wrote: >> >>> >>> U

How to enable RocksDB native metrics?

2024-04-06 Thread Lei Wang
Using big state and want to do some performance tuning, how can i enable RocksDB native metrics? I am using Flink 1.14.4 Thanks, Lei

Re: Optimize exact deduplication for tens of billions data per day

2024-03-29 Thread Lei Wang
hope this helps, > Peter > > On Fri, Mar 29, 2024, 09:08 Lei Wang wrote: > >> >> Use RocksDBBackend to store whether the element appeared within the last >> one day, here is the code: >> >> *public class DedupFunction extends KeyedProcessFunction {*

Optimize exact deduplication for tens of billions data per day

2024-03-28 Thread Lei Wang
Use RocksDBBackend to store whether the element appeared within the last one day, here is the code: *public class DedupFunction extends KeyedProcessFunction {* *private ValueState isExist;* *public void open(Configuration parameters) throws Exception {* *ValueStateDescriptor de

Re:Understanding checkpoint/savepoint storage requirements

2024-03-27 Thread Feifan Wang
Hi Robert : Your understanding are right ! Add some more information : JobManager not only responsible for cleaning old checkpoints, but also needs to write metadata file to checkpoint storage after all taskmanagers have taken snapshots. --- Best Feifan Wang

Re: Understanding RocksDBStateBackend in Flink on Yarn on AWS EMR

2024-03-26 Thread Yang Wang
Usually, you should use the HDFS nameservice instead of the NameNode hostname:port to avoid NN failover. And you could find the supported nameservice in the hdfs-site.xml in the key *dfs.nameservices*. Best, Yang On Fri, Mar 22, 2024 at 8:33 PM Sachin Mittal wrote: > So, when we create an EMR

Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed

2024-03-20 Thread Huajie Wang
Congratulations Best, Huajie Wang Leonard Xu 于2024年3月20日周三 21:36写道: > Hi devs and users, > > We are thrilled to announce that the donation of Flink CDC as a > sub-project of Apache Flink has completed. We invite you to explore the new > resources available: > > - Git

Re: Jobmanager restart after it has been requested to stop

2024-02-02 Thread Yang Wang
If you could find the "Deregistering Flink Kubernetes cluster, clusterId" in the JobManager log, then it is not the expected behavior. Having the full logs of JobManager Pod before restarted will help a lot. Best, Yang On Fri, Feb 2, 2024 at 1:26 PM Liting Liu (litiliu) via user < user@flink.a

Re: [DISCUSS] Hadoop 2 vs Hadoop 3 usage

2024-01-15 Thread Yang Wang
I could share some metrics about Alibaba Cloud EMR clusters. The ratio of Hadoop2 VS Hadoop3 is 1:3. Best, Yang On Thu, Dec 28, 2023 at 8:16 PM Martijn Visser wrote: > Hi all, > > I want to get some insights on how many users are still using Hadoop 2 > vs how many users are using Hadoop 3. Fli

Re: Flink HA with Zookeeper and Docker Compose: unable to startup a working setup.

2024-01-15 Thread Yang Wang
Could you please configure the same HA configurations for TaskManager as well? It seems that the TaskManager container does not use a correct URL when contacting with ResourceManager. Best, Yang On Fri, Dec 29, 2023 at 11:13 PM Alessio Bernesco Làvore < alessio.berne...@gmail.com> wrote: > Hell

Re: Deploying the K8S operator sample on GKE Autopilot : Association with remote system [akka.tcp://flink@basic-example.default:6123] has failed,

2024-01-15 Thread Yang Wang
Could you please directly use the JobManager Pod IP address instead of K8s service name(basic-example.default) and have a try with curl/wget? It seems that the JobManager K8s service could not be accessed. Best, Yang On Sat, Jan 13, 2024 at 1:24 AM LINZ, Arnaud wrote: > Hi, > > Some more tests

Re: Flink Kubernetes HA

2024-01-15 Thread Yang Wang
The fabric8 K8s client is using PATCH to replace get-and-update in v6.6.2. That's why you also need to give PATCH permission for the K8s service account. This would help to decrease the pressure of K8s APIServer. You could find more information here[1]. [1]. https://issues.apache.org/jira/browse/F

[flink-k8s-connector] In-place scaling up often takes several times till it succeeds.

2023-12-06 Thread Xiaolong Wang
Hi, I'm playing with a Flink 1.18 demo with the auto-scaler and the adaptive scheduler. The operator can correctly collect data and order the job to scale up, but it'll take the job several times to reach the required parallelism. E.g. The original parallelism for each vertex is something like b

Re: Flink Kubernetes operator keeps reporting REST client timeout.

2023-11-20 Thread Xiaolong Wang
Seems the operator didn't get restarted automatically after the configmap is changed. After a roll-out restart, the exception disappeared. Never mind this issue. Thanks. On Tue, Nov 21, 2023 at 11:31 AM Xiaolong Wang wrote: > Hi, > > Recently I upgraded the flink-kubernetes-opera

Flink Kubernetes operator keeps reporting REST client timeout.

2023-11-20 Thread Xiaolong Wang
Hi, Recently I upgraded the flink-kubernetes-operator from 1.4.0 to 1.6.1 to use Flink 1.18. After that, the operator kept reporting the following exception: 2023-11-21 03:26:50,505 o.a.f.k.o.r.d.AbstractFlinkResourceReconciler [INFO > ][sn-push/sn-push-decision-maker-log-s3-hive-prd] Resource fu

Re:Custom Metrics not showing in prometheus

2023-09-18 Thread Matt Wang
see if there is any error information about the metrics; [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#metrics-reporter-%3Cname%3E-filter-excludes -- Best, Matt Wang Replied Message | From | patricia lee | | Date | 09/18/2023 16:58 | | To

Re: Flink K8S operator does not support IPv6

2023-09-05 Thread Xiaolong Wang
FYI, adding environment variables of ` KUBERNETES_DISABLE_HOSTNAME_VERIFICATION=true` works for me. This env variable needs to be added to both the Flink operator and the Flink job definition. On Tue, Aug 8, 2023 at 12:03 PM Xiaolong Wang wrote: > Ok, thank you. > > On Tue, Aug 8, 2

Re: Flink K8S operator does not support IPv6

2023-08-07 Thread Xiaolong Wang
Ok, thank you. On Tue, Aug 8, 2023 at 11:22 AM Peter Huang wrote: > We will handle it asap. Please check the status of this jira > https://issues.apache.org/jira/browse/FLINK-32777 > > On Mon, Aug 7, 2023 at 8:08 PM Xiaolong Wang > wrote: > >> Hi, >> >> I w

Flink K8S operator does not support IPv6

2023-08-07 Thread Xiaolong Wang
Hi, I was testing flink-kubernetes-operator in an IPv6 cluster and found out the below issues: *Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname > fd70:e66a:970d::1 not verified:* > > *certificate: sha256/EmX0EhNn75iJO353Pi+1rClwZyVLe55HN3l5goaneKQ=* > > *DN: CN=kube-apiserve

[Bug-report]Flink-operator 1.6.0 repo does not exist yet

2023-08-02 Thread Xiaolong Wang
Hi, I noticed that the newest documentation of the flink-operator has pointed to v1.6.0, yet when using the `helm repo add flink-operator-repo https://downloads.apache.org/flink/flink-kubernetes-operator-1.6.0/` command to install, it turns out that the given URL does not exist. I suppose that 1.

Re: Parallelism under reactive scaling with slot sharing groups

2023-07-31 Thread Allen Wang
y > using the master branch. > > > Best, > Weihua > > > On Tue, Jul 25, 2023 at 2:56 AM Allen Wang wrote: > >> Hello, >> >> Our job has operators of source -> sink -> global committer. We have >> created two slot sharing groups, one for source

Parallelism under reactive scaling with slot sharing groups

2023-07-24 Thread Allen Wang
Hello, Our job has operators of source -> sink -> global committer. We have created two slot sharing groups, one for source and sink and one for global committer. The global committer has specified max parallelism of 1. No max parallelism set with the source/sink while there is a system level defa

Re: How to resume a job from checkpoint with the SQL gateway.

2023-07-18 Thread Xiaolong Wang
able/sqlclient/#terminating-a-job > > Best, > Shammon FY > > > On Tue, Jul 18, 2023 at 9:56 AM Xiaolong Wang > wrote: > >> Hi, Shammon, >> >> I know that the job manager can auto-recover via HA configurations, but >> what if I want to upgrade the running F

Unsubscribe

2023-07-17 Thread wang
Unsubscribe

Unsubscribe

2023-07-16 Thread William Wang

Hadoop Error on ECS Fargate

2023-07-13 Thread Wang, Mengxi X via user
till the errors persist. Can anybody help please? Best wishes, Mengxi Wang This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, malicious content and monitoring of electronic messages.

How to resume a job from checkpoint with the SQL gateway.

2023-07-12 Thread Xiaolong Wang
Hi, I'm currently working on providing a SQL gateway to submit both streaming and batch queries. My question is, if a streaming SQL is submitted and then the jobmanager crashes, is it possible to resume the streaming SQL from the latest checkpoint with the SQL gateway ?

Unsubscribe

2023-07-12 Thread wang
Unsubscribe

Part files generated in reactive mode

2023-07-04 Thread Wang, Mengxi X via user
Hi, We want to process one 2GB file and the output should also be a single 2GB file, but after we enabled reactive mode it generated several hundred small output files instead of one 2GB file. Can anybody help please? Best wishes, Mengxi Wang This message is confidential and subject to terms

SQL-gateway Failed to Run

2023-07-02 Thread Xiaolong Wang
Hi, I've tested the Flink SQL-gateway to run some simple Hive queries but met some exceptions. Environment Description: Run on : Kubernetes Deployment Mode: Session Mode (created by a flink-kubernetes-operator) Steps to run: 1. Apply a `flinkdeployment` of flink session cluster to flink operator

Re: Default Log4j properties in Native Kubernetes

2023-06-20 Thread Yang Wang
I assume you are using "*bin/flink run-application*" to submit a Flink application to K8s cluster. Then you could simply update your local log4j-console.properties, it will be shipped and mounted to JobManager/TaskManager pods via ConfigMap. Best, Yang Vladislav Keda 于2023年6月20日周二 22:15写道: > Hi

Whether Flink SQL window operations support "Allow Lateness and SideOutput"?

2023-02-20 Thread wang
Hi dear engineers, One question as title: Whether Flink SQL window operations support "Allow Lateness and SideOutput"? Just as supported in Datastream api (allowedLateness and sideOutputLateData) like: SingleOutputStreamOperator<>sumStream = dataStream.keyBy().timeWindow()

Re: "Error while retrieving the leader gateway" when using Kubernetes HA

2023-01-30 Thread Yang Wang
I assume you are using the standalone mode. Right? For the native K8s mode, the leader address should be *akka.tcp://flink@JM_POD_IP:6123/user/rpc/dispatcher_1 *when HA enabled. Best, Yang Anton Ippolitov via user 于2023年1月31日周二 00:21写道: > This is actually what I'm already doing, I'm only sett

Re: Apache Beam MinimalWordCount on Flink on Kubernetes using Flink Kubernetes Operator on GCP

2023-01-17 Thread Yang Wang
The "JAR file does not exist" exception comes from the JobManager side, not on the client. Please be aware that the local:// scheme in the jarURI means the path in the JobManager pod. You could use an init-container to download your user jar and mount it to the JobManager main-container. Refer to

Re: Supplying jar stored at S3 to flink to run the job in kubernetes

2023-01-16 Thread Yang Wang
Do you build your own flink-kubernetes-operator image with the flink-s3-fs plugin bundled[1]? [1]. https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.3/docs/custom-resource/overview/#flinksessionjob-spec-overview Best, Yang Weihua Hu 于2023年1月17日周二 10:47写道: > Hi, Rahul

Re: Flink Job Manager Recovery from EKS Node Terminations

2023-01-11 Thread Yang Wang
First, JobManager does not store any persistent data to local when the Kubernetes HA + S3 used. It means that you do not need to mount a PV for JobMananger deployment. Secondly, node failures or terminations should not cause the CrashLoopBackOff status. One possible reason I could imagine is a bug

Re: The use of zookeeper in flink

2023-01-03 Thread Yang Wang
The reason why the running jobs try to failover with zookeeper outage is that the JobManager lost leadership. Having a standby JobManager or not makes no difference. Best, Yang Matthias Pohl via user 于2023年1月2日周一 20:51写道: > And I screwed up the reply again. -.- Here's my previous response for t

Re: How to get failed streaming Flink job log in Flink Native K8s mode?

2023-01-03 Thread Yang Wang
I think you might need a sidecar container or daemonset to collect the Flink logs and store into a persistent storage. You could find more information here[1]. [1]. https://www.alibabacloud.com/blog/best-practices-of-kubernetes-log-collection_596356 Best, Yang hjw 于2022年12月22日周四 23:28写道: > On

Re: Stand alone K8s HA mode with Static Tokens Used by Service Accounts

2022-11-24 Thread Yang Wang
IIUC, the fabric8 Kubernetes-client 5.5.0 should already support to reload the latest kube config if received 401 error. Refer to the following PR[1] for more information. Please share your feedback here if it still could not work. [1]. https://github.com/fabric8io/kubernetes-client/pull/2731 Be

Re: support escaping `#` in flink job spec in Flink-operator

2022-11-08 Thread Yang Wang
This is a known limit of the current Flink options parser. Refer to FLINK-15358[1] for more information. [1]. https://issues.apache.org/jira/browse/FLINK-15358 Best, Yang Gyula Fóra 于2022年11月8日周二 14:41写道: > It is also possible that this is a problem of the Flink native Kubernetes > integration

Re: [DISCUSS ] add --jars to support users dependencies jars.

2022-10-27 Thread Yang Wang
Thanks Jacky Lau for starting this discussion. I understand that you are trying to find a convenient way to specify dependency jars along with user jar. However, let's try to narrow down by differentiating deployment modes. # Standalone mode No matter you are using the standalone mode on virtual

Re: configMap value error when using flink-operator?

2022-10-26 Thread Yang Wang
Maybe we could change the values of *taskmanager.numberOfTaskSlots* and *parallelism.default *in flink-conf.yaml of Kubernetes operator to 1, which are aligned with the default values in Flink codebase. Best, Yang Gyula Fóra 于2022年10月26日周三 15:17写道: > Hi! > > I agree that this might be confusin

Re: Flink Native K8S RBAC

2022-10-20 Thread Yang Wang
I have created a ticket[1] to fill the missing part in the native K8s documentation. [1]. https://issues.apache.org/jira/browse/FLINK-29705 Best, Yang Gyula Fóra 于2022年10月20日周四 13:37写道: > Hi! > > As a reference you can look at how the Flink Kubernetes Operator manages > RBAC settings: > > > ht

Re: Activate Flink HA without checkpoints on k8S

2022-10-19 Thread Yang Wang
Add some more information to Gyula's comment. For application mode without checkpoint, you do not need to activate the HA since it will not take any effect and the Flink job will be submitted again after the JobManager restarted. Because the job submission happens on the JobManager side. For sess

Watermark generating mechanism in Flink SQL

2022-10-17 Thread wang
Hi dear engineers, I have one question about watermark generating mechanism in Flink SQL. There are two mechanisms called Periodic Watermarks and Punctuated Watermarks, I want to use Periodic Watermarks with interval 5 seconds (meaning watermarks will be generated every 5 seconds), how should

Re: fail to mount hadoop-config-volume when using flink-k8s-operator

2022-10-13 Thread Yang Wang
Currently, exporting the env "HADOOP_CONF_DIR" could only work for native K8s integration. The flink client will try to create the hadoop-config-volume automatically if hadoop env found. If you want to set the HADOOP_CONF_DIR in the docker image, please also make sure the specified hadoop conf dir

Re: serviceAccount permissions issue for high availability in operator 1.1

2022-09-20 Thread Yang Wang
ay to change that to use standalone K8s? I haven't > seen anything about that in the docs, besides a mention that standalone > support is coming in version 1.2 of the operator. > > Thanks, > > Javier > > > On Thu, Sep 8, 2022, 22:50 Yang Wang wrote: > >> Si

Re: serviceAccount permissions issue for high availability in operator 1.1

2022-09-08 Thread Yang Wang
Since the flink-kubernetes-operator is using native K8s integration[1] by default, you need to give the permissions of pod and deployment as well as ConfigMap. You could find more information about the RBAC here[2]. [1]. https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/r

Re: [Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock

2022-09-08 Thread Yang Wang
t should be a warning then > > What about the 1st error we encountered regarding the kube/config file > exception? > > > Thank you so much, > Best, > Tamir > > -- > *From:* Yang Wang > *Sent:* Thursday, September 8, 2022 7:08 AM >

Re: [Flink Kubernetes Operator] FlinkSessionJob crd spec jarURI

2022-09-08 Thread Yang Wang
Given that the "local://" schema means the jar is available in the image/container of JobManager, so it could only be supported in the K8s application mode. If you configure the jarURI to "file://" schema for session cluster, it means that this jar file should be available in the flink-kubernetes-

Re: Deploying Jobmanager on k8s as a Deployment

2022-09-07 Thread Yang Wang
For native K8s integration, the Flink ResourceManager will delete the JobManager K8s deployment as well as the HA data once the job reached a globally terminal state. However, it is indeed a problem for standalone mode since the JobManager will be restarted again even the job has finished. I think

Re: [Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock

2022-09-07 Thread Yang Wang
"data-agg-events-insertion-cluster-config-map" already > exists. > > Log file is enclosed. > > Thanks, > Tamir. > > -- > *From:* Yang Wang > *Sent:* Monday, September 5, 2022 3:03 PM > *To:* Tamir Sagi > *Cc:* user@flink.apache.org ; Lihi Peretz < &g

Re: [Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock

2022-09-05 Thread Yang Wang
Could you please check whether the "kubernetes.config.file" is configured to /opt/flink/.kube/config in the Flink configmap? It should be removed before creating the Flink configmap. Best, Yang Tamir Sagi 于2022年9月4日周日 18:08写道: > Hey All, > > We recently updated to Flink 1.15.1. We deploy stream

Re: How to open a Prometheus metrics port on the rest service when using the Kubernetes operator?

2022-09-05 Thread Yang Wang
I do not think we could add an additional port to the rest service since it is created by Flink internally. Actually, I do not suggest scrapping the metrics from rest service. Instead, the port in the pod needs to be used. Because the metrics might not work correctly if multiple JobManagers are ru

Re: [E] Re: Kubernetes operator expose UI rest service as NodePort instead of default clusterIP

2022-09-05 Thread Yang Wang
ed to "ip" might work. Let me try that. I >>> believe there should be a reason to always override the >>> "REST_SERVICE_EXPOSED_TYPE" to "ClusterIP". >>> >>> [1] https://docs.aws.amazon.com/eks/latest/userguide/alb-ingress.html >

Re: Kubernetes operator expose UI rest service as NodePort instead of default clusterIP

2022-09-01 Thread Yang Wang
I am afraid the current flink-kubernetes-operator always overrides the "REST_SERVICE_EXPOSED_TYPE" to "ClusterIP". Could you please share why the ingress[1] could not meet your requirements? Compared with NodePort, I think it is a more graceful implementation. [1]. https://nightlies.apache.org/fli

Re: Error when run test case in Windows

2022-08-22 Thread Yang Wang
It is caused by the following assert. Maybe we could *File.pathSeparator* instead of "/". *assertThat(optional.get()).isEqualTo(hadoopHome + "/conf");* Would you like to create a ticket and attach a PR for this issue? Best, Yang hjw <1010445...@qq.com> 于2022年8月21日周日 19:44写道: > When I run mvn c

Re: Flink running same task on different Task Manager

2022-08-18 Thread Lijie Wang
lism to 12 for the task2 (this real-time task needs to >> read from 12 different Kafka partitions hence setting it to 12) >> 3. set parallelism of task1 to 2 >> 4. then set this cluster.evenly-spread-out-slots: true >> >> Will these methods work? Also, I did not find a

Does flink sql support UDTAGG

2022-08-07 Thread wang
Hi dear engineers, One small question: does flink sql support UDTAGG? (user-defined table aggregate function), seems only supported in flink table api? If not supported in flink sql, how can I define an aggregated udf which could output multiple rows to kafka. Thanks for your help! Rega

Re: Flink task lifecycle listener/hook/SPI

2022-08-03 Thread Lijie Wang
Hi Allen, >From my experience, you can do your init setup by the following 2 ways: 1. Do your init setup in RichFunction#open method, see [1] for details. 2. Do your init setup in static block, it will be executed when the class is loaded. [1] https://nightlies.apache.org/flink/flink-docs-master/

Re: Migration to application mode

2022-08-01 Thread Lijie Wang
Hi, I think the difference between ApplicationMode and PerJob is just where the main method is executed (ApplicationMode executes on JM, PerJob executes on client side). So I think your original job code should work well under ApplicationMode. Did you encounter any problems? You can get more detail

Re: Issues with Flink scheduler?

2022-07-31 Thread Lijie Wang
Hi, Which version are you using? Has any job failover occurred? It would be better if you can provide the full log of JM. Best, Lijie Hemanga Borah 于2022年8月1日周一 01:47写道: > Hello guys, > We have been seeing an issue with our Flink applications. Our > applications run fine for several hours, an

Re: Flink Operator Resources Requests and Limits

2022-07-27 Thread Yang Wang
We have the *kubernetes.jobmanager.cpu.limit-factor* and *kubernetes.jobmanager.memory.limit-factor* to control the limit value. The resources limit memory will be set to memory/cpu * limit-factor. Best, Yang PACE, JAMES 于2022年7月28日周四 01:26写道: > That does not seem to work. > > > > For instanc

Re: NodePort conflict for multiple HA application-mode standalone Kubernetes deploys in same namespace

2022-07-24 Thread Yang Wang
Removing the nodePort for every different Flink application is necessary so that it could pick up a random port. Moreover, I believe you also need to change some other yamls. For example, having a different name for JobManager/TaskManager yamls, update the jobmanager-service.yaml and flink-configu

Re: [ANNOUNCE] Apache Flink Kubernetes Operator 1.1.0 released

2022-07-24 Thread Yang Wang
Congrats! Thanks Gyula for driving this release, and thanks to all contributors! Best, Yang Gyula Fóra 于2022年7月25日周一 10:44写道: > The Apache Flink community is very happy to announce the release of Apache > Flink Kubernetes Operator 1.1.0. > > The Flink Kubernetes Operator allows users to manage

Re: standalone mode support in the kubernetes operator (FLIP-25)

2022-07-18 Thread Yang Wang
ster? > What is the advantag doing so. > > Yang Wang 于2022年7月14日周四 10:55写道: > > > > I think the standalone mode support is expected to be done in the > version 1.2.0[1], which will be released on Oct 1 (ETA). > > > > [1]. > https://cwiki.apache.org/confluence/di

Re: standalone mode support in the kubernetes operator (FLIP-25)

2022-07-13 Thread Yang Wang
I think the standalone mode support is expected to be done in the version 1.2.0[1], which will be released on Oct 1 (ETA). [1]. https://cwiki.apache.org/confluence/display/FLINK/Release+Schedule+and+Planning Best, Yang Javier Vegas 于2022年7月14日周四 06:25写道: > Hello! The operator docs > https://n

Re: Flink running same task on different Task Manager

2022-07-13 Thread Lijie Wang
lure > ? > > On Wed, Jun 15, 2022 at 6:13 PM Lijie Wang > wrote: > >> Hi Great, >> >> Do you mean there is a Task1 and a Task2 on each task manager? >> >> If so, I think you can set Task1 and Task2 to the same parallelism and >> set them in the same

Re: DataStream.keyBy() with keys determined at run time

2022-07-11 Thread Thomas Wang
Tuple0 to Tuple25. > > > > On Sun, Jul 10, 2022 at 5:43 PM Thomas Wang wrote: > >> > >> I didn't copy the exact error message, but basically the idea of the > error message is that I cannot use the abstract class Tuple and instead, I > should use Tuple1, T

Re: Is it possible to save Table to CSV?

2022-07-11 Thread Lijie Wang
You can use the FileSink and set the format to csv. An example of FileSink: https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/filesystem/#full-example Best, Lijie 于2022年7月11日周一 16:16写道: > > If I create dynamic table with: > > > CREATE TABLE some_table (name STRING, scor

Re: DataStream.keyBy() with keys determined at run time

2022-07-10 Thread Thomas Wang
10, 2022 at 6:30 AM Thomas Wang wrote: > >> Hi, >> >> I have a use case where I need to call DataStream.keyBy() with keys >> loaded from a configuration. The number of keys and their data types are >> variables and is determined by the configuration. Once the con

DataStream.keyBy() with keys determined at run time

2022-07-10 Thread Thomas Wang
Hi, I have a use case where I need to call DataStream.keyBy() with keys loaded from a configuration. The number of keys and their data types are variables and is determined by the configuration. Once the configuration is loaded, they won't change. I'm trying to use the following key selector, but

  1   2   3   4   5   6   7   >