Re: Questions about using Flink MongoDB CDC

2024-11-17 Thread Jiabao Sun
/docs/connectors/flink-sources/mongodb-cdc/#full-changelog wangye...@yeah.net wrote on Sat, Nov 16, 2024 at 21:26: > Hi all: > While using Flink with MongoDB CDC, I've noticed that my Flink job > causes MongoDB's memory usage to continuously increase. Below,

Questions about using Flink MongoDB CDC

2024-11-16 Thread wangye...@yeah.net
Hi all: While using Flink with MongoDB CDC, I've noticed that my Flink job causes MongoDB's memory usage to continuously increase. Below, I will detail the specific scenario to help identify where the issue lies. 1. MongoDB deployment architecture: sharded. 2. The memor
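
A minimal sketch of the kind of source setup under discussion, assuming the Ververica flink-connector-mongodb-cdc artifact is on the classpath; the host, credentials, and collection names below are placeholders, not taken from the thread:

    import com.ververica.cdc.connectors.mongodb.source.MongoDBSource;
    import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;
    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class MongoCdcSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Placeholder connection settings; point at a mongos for a sharded cluster.
            MongoDBSource<String> source = MongoDBSource.<String>builder()
                    .hosts("mongos-host:27017")
                    .databaseList("mydb")
                    .collectionList("mydb.orders")
                    .username("flink_user")
                    .password("secret")
                    .deserializer(new JsonDebeziumDeserializationSchema())
                    .build();

            env.fromSource(source, WatermarkStrategy.noWatermarks(), "MongoDB CDC Source").print();
            env.execute("mongodb-cdc-sketch");
        }
    }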

Re:Re: Found an issue when using Flink 1.19's AsyncScalarFunction

2024-04-15 Thread Xuyang
xt" with version "1.10.0" manually to check if it works? If you can work around this bug by this way, I think we should open an bug issue for it. -- Best! Xuyang At 2024-04-09 18:11:27, "Xiaolong Wang" wrote: Hi, I found a ClassNotFound exception w

Re:Found an issue when using Flink 1.19's AsyncScalarFunction

2024-04-09 Thread Xuyang
s bug this way, I think we should open a bug issue for it. -- Best! Xuyang At 2024-04-09 18:11:27, "Xiaolong Wang" wrote: Hi, I found a ClassNotFound exception when using Flink 1.19's AsyncScalarFunction. Stack trace: Caused b

Found an issue when using Flink 1.19's AsyncScalarFunction

2024-04-09 Thread Xiaolong Wang
Hi, I found a ClassNotFound exception when using Flink 1.19's AsyncScalarFunction. Stack trace: Caused by: java.lang.ClassNotFoundException: > org.apache.commons.text.StringSubstitutor > > at java.net.URLClassLoader.findClass(Unknown Source) ~[?:?] > > at java.lang.ClassLoad
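
For context, a minimal sketch of how an AsyncScalarFunction is declared in Flink 1.19; the function body is a made-up example, and the workaround discussed above amounts to adding org.apache.commons:commons-text:1.10.0 (the library that contains StringSubstitutor) to the job's dependencies:

    import java.util.concurrent.CompletableFuture;
    import org.apache.flink.table.functions.AsyncScalarFunction;

    // Hypothetical async UDF: the first eval parameter is the CompletableFuture
    // that delivers the result asynchronously.
    public class ReverseAsync extends AsyncScalarFunction {
        public void eval(CompletableFuture<String> result, String input) {
            // A real function would call an async client here; this one
            // completes immediately with the reversed input.
            result.complete(new StringBuilder(input).reverse().toString());
        }
    }

It would be registered like any other UDF, e.g. tableEnv.createTemporarySystemFunction("REVERSE_ASYNC", ReverseAsync.class).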

High latency in reading Iceberg tables using Flink table api

2024-03-12 Thread Chetas Joshi
Hello all, I am using the flink-iceberg-runtime lib to read an iceberg table into a Flink datastream. I am using Glue as the catalog. I use the flink table API to build and query an iceberg table and then use toDataStream to convert it into a DataStream. Here is the code Table table = streamTable
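
A rough sketch of the pattern described, assuming the Iceberg/Glue catalog has already been registered; catalog, database, and table names are placeholders:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
    import org.apache.flink.types.Row;

    public class IcebergReadSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

            // Assumes an Iceberg catalog (e.g. backed by Glue) was created
            // beforehand via tableEnv.executeSql("CREATE CATALOG ...").
            Table table = tableEnv.sqlQuery("SELECT * FROM my_catalog.my_db.my_table");

            // The toDataStream conversion mentioned in the thread.
            DataStream<Row> stream = tableEnv.toDataStream(table);
            stream.print();
            env.execute("iceberg-read-sketch");
        }
    }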

Re: Request to provide sample codes on Data stream using flink, Spring Boot & kafka

2024-02-08 Thread David Anderson
For a collection of several complete sample applications using Flink with Kafka, see https://github.com/confluentinc/flink-cookbook. And I agree with Marco -- in fact, I would go farther, and say that using Spring Boot with Flink is an anti-pattern. David On Wed, Feb 7, 2024 at 4:37 PM Marco

Re: Request to provide sample codes on Data stream using flink, Spring Boot & kafka

2024-02-07 Thread Marco Villalobos
Hi Nida, You can find sample code for using Kafka here: https://kafka.apache.org/documentation/ You can find sample code for using Flink here: https://nightlies.apache.org/flink/flink-docs-stable/ You can find sample code for using Flink with Kafka here: https://nightlies.apache.org/flink
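
For reference, a minimal self-contained DataStream job reading from Kafka with the unified KafkaSource API; broker address, topic, and group id are placeholders:

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class KafkaReadSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            KafkaSource<String> source = KafkaSource.<String>builder()
                    .setBootstrapServers("localhost:9092")   // placeholder broker
                    .setTopics("input-topic")                // placeholder topic
                    .setGroupId("sample-group")
                    .setStartingOffsets(OffsetsInitializer.earliest())
                    .setValueOnlyDeserializer(new SimpleStringSchema())
                    .build();

            env.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source").print();
            env.execute("kafka-read-sketch");
        }
    }

Note that, as Marco and David point out, no Spring Boot wiring is involved: a plain main method is the idiomatic entry point for a Flink job.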

Re: Request to provide sample codes on Data stream using flink, Spring Boot & kafka

2024-02-06 Thread Alexis Sarda-Espinosa
> Hi Team, > > I request you to provide sample codes on data streaming using flink, kafka > and spring boot. > > Awaiting your response. > > Thanks & Regards > Nida Shaikh >

Request to provide sample codes on Data stream using flink, Spring Boot & kafka

2024-02-06 Thread Fidea Lidea
Hi Team, I request you to provide sample codes on data streaming using flink, kafka and spring boot. Awaiting your response. Thanks & Regards Nida Shaikh

Re: How to monitor changes in the existing files using flink 1.17.2

2024-01-09 Thread Yu Chen
/file/src/impl/ContinuousFileSplitEnumerator.java#L62C33-L62C54 [2] Overview | Apache Paimon <https://paimon.apache.org/docs/master/concepts/overview/> Best, Yu Chen > On Jan 10, 2024 at 04:31, Nitin Saini wrote: > > Hi Flink Community, > > I was using flink 1.12.7 readFile to read fil

How to monitor changes in the existing files using flink 1.17.2

2024-01-09 Thread Nitin Saini
Hi Flink Community, I was using flink 1.12.7 readFile to read files from s3; it was able to monitor new files being added as well as updates to existing files. But now I have migrated to flink 1.17.2 and am using FileSource to read files from s3; it is able to monitor if new files
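
A sketch of the 1.17-style FileSource setup being discussed (path and scan interval are placeholders). Per the ContinuousFileSplitEnumerator link in the reply above, continuous monitoring only discovers new files; modifications to files that were already processed are not re-read:

    import java.time.Duration;
    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.connector.file.src.FileSource;
    import org.apache.flink.connector.file.src.reader.TextLineInputFormat;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class FileMonitorSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Scans the directory every 30 seconds; only *new* files are picked up.
            FileSource<String> source = FileSource
                    .forRecordStreamFormat(new TextLineInputFormat(), new Path("s3://bucket/input/"))
                    .monitorContinuously(Duration.ofSeconds(30))
                    .build();

            env.fromSource(source, WatermarkStrategy.noWatermarks(), "File Source").print();
            env.execute("file-monitor-sketch");
        }
    }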

Unable to locate full stderr logs while using Flink Operator

2023-12-07 Thread Edgar H
Hi all! I've just deployed an Apache Beam job using FlinkRunner in k8s and found that the job failed and has the following field: error: >- {"type":"org.apache.flink.util.SerializedThrowable","message":"org.apache.flink.client.program.ProgramInvocationException: The main method caused an

Having Problems as beginners using Flink in our Data Engineering internship

2023-11-29 Thread ALIZOUAOUI
Hey, I hope y'all are doing well. I started a project (a 4-month internship in Data Engineering); it's a totally new field for me and I have to present 2 months from now. I went with Java (I have no prior experience using it) and I started with the Flink documentation; we have to follow the operation

Re: Handling default fields in Avro messages using Flink SQL

2023-11-13 Thread Hang Ruan
Hi, Dale. I think there are two choices to try. 1. As the reply in #22427[1], use the SQL function `COALESCE`. 2. Modify the code in the Avro format yourself. There is some work to do for choice 2. First, you need to pass the default value in Schema, which does not contain the default value no
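
A sketch of choice 1, using the schema from the original message below; assuming the column was declared nullable on the Flink side, COALESCE substitutes the Avro default when the field arrives as NULL (the table name is a placeholder):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class CoalesceSketch {
        public static void main(String[] args) {
            TableEnvironment tableEnv =
                    TableEnvironment.create(EnvironmentSettings.inStreamingMode());

            // Assumes a Kafka/Avro table named `my_topic` was registered earlier;
            // 'Hello World' mirrors the schema default shown in the thread.
            tableEnv.executeSql(
                    "SELECT COALESCE(favouritePhrase, 'Hello World') AS favouritePhrase "
                            + "FROM my_topic").print();
        }
    }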

Handling default fields in Avro messages using Flink SQL

2023-11-13 Thread Dale Lane
I have a Kafka topic with events produced using an Avro schema like this: { "namespace": "demo.avro", "type": "record", "name": "MySimplifiedRecreate", "fields": [ { "name": "favouritePhrase", "type": "string", "default": "Hello World"

Re: Using Flink k8s operator on OKD

2023-10-05 Thread Gyula Fóra
>>> resources: >>> - deployments >>> - deployments/finalizers >>> verbs: >>> - '*' >>> --- >>> apiVersion: rbac.authorization.k8s.io/v1 >>> kind: RoleBinding >>> metadata: >>> labels: >>>

Re: Using Flink k8s operator on OKD

2023-10-05 Thread Krzysztof Chmielewski
/finalizers >> verbs: >> - '*' >> --- >> apiVersion: rbac.authorization.k8s.io/v1 >> kind: RoleBinding >> metadata: >> labels: >> app.kubernetes.io/name: flink-kubernetes-operator >> app.kubernetes.io/version: 1.5.0 >> name

Re: Using Flink k8s operator on OKD

2023-09-20 Thread Krzysztof Chmielewski
> name: flink-role-binding > roleRef: > apiGroup: rbac.authorization.k8s.io > kind: Role > name: flink > subjects: > - kind: ServiceAccount > name: flink > EOF > > Hopefully that helps. > > > On Tue, Sep 19, 2023 at 5:40 PM Krzysztof Chmielewski < >

Re: Using Flink k8s operator on OKD

2023-09-19 Thread Zach Lorimer
want to have Flink deployments in. kubectl apply -f - <<EOF … Krzysztof Chmielewski wrote: > Hi community, > I was wondering if anyone tried to deploy Flink using Flink k8s operator > on a machine where OKD [1] is installed? > > We have tried to install Flink k8s operator version 1.6, which seems to > succeed

Using Flink k8s operator on OKD

2023-09-19 Thread Krzysztof Chmielewski
Hi community, I was wondering if anyone tried to deploy Flink using Flink k8s operator on a machine where OKD [1] is installed? We have tried to install Flink k8s operator version 1.6, which seems to succeed; however, when we try to deploy a simple Flink deployment we are getting an error. 2023-09-19

Re: Reading parquet files using Flink

2023-09-12 Thread liu ron
Hi, Are you using the DataStream API to read the parquet file? Why not use Flink SQL to read it? The ParquetRowInputFormat has been removed; you can use ParquetColumnarRowInputFormat in 1.17.1. Best, Ron Hou, Lijuan via user wrote on Tue, Sep 12, 2023 at 05:49: > Hi team, > > > > Is there any defined way t
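
A rough sketch of the suggested ParquetColumnarRowInputFormat path (the schema and constructor arguments are illustrative; the exact signature should be checked against the 1.17.1 javadoc):

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.connector.file.src.FileSource;
    import org.apache.flink.connector.file.src.FileSourceSplit;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.formats.parquet.ParquetColumnarRowInputFormat;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.data.RowData;
    import org.apache.flink.table.runtime.typeutils.InternalTypeInfo;
    import org.apache.flink.table.types.logical.IntType;
    import org.apache.flink.table.types.logical.LogicalType;
    import org.apache.flink.table.types.logical.RowType;
    import org.apache.flink.table.types.logical.VarCharType;
    import org.apache.hadoop.conf.Configuration;

    public class ParquetReadSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Illustrative schema (id INT, name STRING); adjust to the file's schema.
            RowType rowType = RowType.of(
                    new LogicalType[] {new IntType(), new VarCharType(VarCharType.MAX_LENGTH)},
                    new String[] {"id", "name"});

            ParquetColumnarRowInputFormat<FileSourceSplit> format =
                    new ParquetColumnarRowInputFormat<>(
                            new Configuration(), rowType, InternalTypeInfo.of(rowType),
                            500, false, true);

            FileSource<RowData> source = FileSource
                    .forBulkFileFormat(format, new Path("s3://bucket/data.parquet"))
                    .build();
            env.fromSource(source, WatermarkStrategy.noWatermarks(), "Parquet Source").print();
            env.execute("parquet-read-sketch");
        }
    }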

Reading parquet files using Flink

2023-09-11 Thread Hou, Lijuan via user
Hi team, Is there any defined way to read Parquet files for flink 1.17.1? I did some search, and found this for

FeatHub : a feature store for ETL real-time features using Flink

2023-08-13 Thread Dong Lin
Dong Lin wrote on Mon, Aug 14, 2023 at 09:02: > Hi all, > > I am writing this email to promote our open-source feature store project ( > FeatHub <https://github.com/alibaba/feathub>) that supports using Flink > (production-ready) and Spark (not production-ready) to compute real-time /

ETL real-time features using Flink with application-level metrics

2023-08-13 Thread Dong Lin
Hi all, I am writing this email to promote our open-source feature store project ( FeatHub <https://github.com/alibaba/feathub>) that supports using Flink (production-ready) and Spark (not production-ready) to compute real-time / offline features with pythonic declarative feature specific

Failing to process timestamp data from Kafka + Debezium Avro using Flink SQL

2023-03-06 Thread Frank Lyaruu
Hi all, I'm trying to ingest change data capture data from Kafka which contains some timestamps. I'm using Flink SQL, and I'm running into issues, specifically with the created_at field. In postgres, it is of type 'timestamptz'. My table definition is this: CREATE TA
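
The thread doesn't show the full DDL; as a hedged sketch under the assumption of a Kafka topic carrying Confluent-registered Debezium Avro, one common shape maps the Postgres timestamptz column to TIMESTAMP_LTZ (connector options, topic, and registry URL are placeholders, and whether this matches the original setup is an assumption):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class CdcTimestampSketch {
        public static void main(String[] args) {
            TableEnvironment tableEnv =
                    TableEnvironment.create(EnvironmentSettings.inStreamingMode());

            // Hypothetical table definition for a Debezium-Avro changelog topic.
            tableEnv.executeSql(
                    "CREATE TABLE my_table (\n"
                            + "  id INT,\n"
                            + "  created_at TIMESTAMP_LTZ(3)\n"
                            + ") WITH (\n"
                            + "  'connector' = 'kafka',\n"
                            + "  'topic' = 'my-topic',\n"
                            + "  'properties.bootstrap.servers' = 'localhost:9092',\n"
                            + "  'format' = 'debezium-avro-confluent',\n"
                            + "  'debezium-avro-confluent.url' = 'http://schema-registry:8081'\n"
                            + ")");
        }
    }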

Re: Apache Beam MinimalWordCount on Flink on Kubernetes using Flink Kubernetes Operator on GCP

2023-01-17 Thread Yang Wang
The "JAR file does not exist" exception comes from the JobManager side, not on the client. Please be aware that the local:// scheme in the jarURI means the path in the JobManager pod. You could use an init-container to download your user jar and mount it to the JobManager main-container. Refer to

Apache Beam MinimalWordCount on Flink on Kubernetes using Flink Kubernetes Operator on GCP

2023-01-17 Thread Lee Parayno
I have a Kubernetes cluster in GCP running the Flink Kubernetes Operator. I'm trying to package a project with the Apache Beam MinimalWordCount using the Flink Runner as a FlinkDeployment to the Kubernetes Cluster Job Docker image created with this Dockerfile: FROM flink ENV FLINK_CLASSPATH /op

Re: configMap value error when using flink-operator?

2022-10-26 Thread Gyula Fóra
Makes sense, I will fix this! Gyula On Thu, Oct 27, 2022 at 4:09 AM Biao Geng wrote: > +1 to Yang's suggestion. > > Yang Wang wrote on Wed, Oct 26, 2022 at 20:00: >> Maybe we could change the values of *taskmanager.numberOfTaskSlots* and >> *parallelism.default >> *in flink-conf.yaml of Kubernetes operato

Re: configMap value error when using flink-operator?

2022-10-26 Thread Biao Geng
+1 to Yang's suggestion. Yang Wang wrote on Wed, Oct 26, 2022 at 20:00: > Maybe we could change the values of *taskmanager.numberOfTaskSlots* and > *parallelism.default > *in flink-conf.yaml of Kubernetes operator to 1, which are aligned with > the default values in Flink codebase. > > > Best, > Yang > > Gy

Re: configMap value error when using flink-operator?

2022-10-26 Thread Yang Wang
Maybe we could change the values of *taskmanager.numberOfTaskSlots* and *parallelism.default *in flink-conf.yaml of Kubernetes operator to 1, which are aligned with the default values in Flink codebase. Best, Yang Gyula Fóra wrote on Wed, Oct 26, 2022 at 15:17: > Hi! > > I agree that this might be confusin

Re: configMap value error when using flink-operator?

2022-10-26 Thread Gyula Fóra
Hi! I agree that this might be confusing but let me explain what happened. In the operator you can define default flink configuration. Currently it is https://github.com/apache/flink-kubernetes-operator/blob/main/helm/flink-kubernetes-operator/conf/flink-conf.yaml It contains numberOfTaskSlots=2.

configMap value error when using flink-operator?

2022-10-25 Thread Liting Liu (litiliu)
Hi: I'm trying to deploy a flink job with flink-operator. The flink-operator's version is 1.2.0. And the yaml I use is here: apiVersion: flink.apache.org/v1beta1 kind: FlinkDeployment metadata: name: basic-example spec: image: flink:1.15 flinkVersion: v1_15 flinkConfiguration:

Re: Cannot run pyflink example using Flink CLI

2022-10-21 Thread Levan Huyen
Great, thanks! Kind regards, Levan Huyen On Fri, 21 Oct 2022 at 00:53, Biao Geng wrote: > You are right. > It contains the python package `pyflink` and some dependencies like py4j > and cloudpickle but does not contain all relevant dependencies(e.g. > `google.protobuf` as the error log shows, w

Re: Cannot run pyflink example using Flink CLI

2022-10-20 Thread Biao Geng
You are right. It contains the python package `pyflink` and some dependencies like py4j and cloudpickle but does not contain all relevant dependencies (e.g. `google.protobuf` as the error log shows, which I also reproduced on my own machine). Best, Biao Geng Levan Huyen wrote on Thu, Oct 20, 2022 at 19:53: >

Re: Cannot run pyflink example using Flink CLI

2022-10-20 Thread Levan Huyen
Thanks Biao. May I ask one more question: does the binary package on Apache site (e.g: https://archive.apache.org/dist/flink/flink-1.15.2) contain the python package `pyflink` and its dependencies? I guess the answer is no. Thanks and regards, Levan Huyen On Thu, 20 Oct 2022 at 18:13, Biao Geng

Re: Cannot run pyflink example using Flink CLI

2022-10-20 Thread Biao Geng
Hi Levan, Great to hear that your issue is resolved! For the follow-up question, I am not quite familiar with AWS EMR's configuration for flink, but from the error you attached it looks like pyflink may not ship some 'Google' dependencies in the Flink binary zip file and as a result, it will

Re: Cannot run pyflink example using Flink CLI

2022-10-19 Thread Levan Huyen
Hi Biao, Thanks for your help. That solved my issue. It turned out that in setup1 (in EMR), I got apache-flink installed, but the package (and its dependencies) are not in the directory `/usr/lib/python3.7/site-packages` (corresponding to the python binary in `/usr/bin/python3`). For some reason,

Re: Cannot run pyflink example using Flink CLI

2022-10-19 Thread Biao Geng
Hi Levan, For your setups 1 & 2, it looks like the python environment is not ready. Have you tried python -m pip install apache-flink for the first 2 setups? For your setup 3, as you are trying to use the `flink run ...` command, it will try to connect to a launched flink cluster but I guess you did not

Cannot run pyflink example using Flink CLI

2022-10-19 Thread Levan Huyen
Hi, I'm new to PyFlink, and I couldn't run a basic example that shipped with Flink. This is the command I tried: ./bin/flink run -py examples/python/datastream/word_count.py Here below are the results I got with different setups: 1. On AWS EMR 6.8.0 (Flink 1.15.1): *Error: No module named 'goog

Re: fail to mount hadoop-config-volume when using flink-k8s-operator

2022-10-13 Thread Yang Wang
pods. Best, Yang Liting Liu (litiliu) wrote on Wed, Oct 12, 2022 at 16:11: > Hi, community: > I'm using flink-k8s-operator v1.2.0 to deploy a flink job. And the > "HADOOP_CONF_DIR" environment variable was set in the image that I > built from flink:1.15. I found the taskman

fail to mount hadoop-config-volume when using flink-k8s-operator

2022-10-12 Thread Liting Liu (litiliu)
Hi, community: I'm using flink-k8s-operator v1.2.0 to deploy a flink job. And the "HADOOP_CONF_DIR" environment variable was set in the image that I built from flink:1.15. I found the taskmanager pod was trying to mount a volume named "hadoop-config-volume"

Re: Re:Re: get NoSuchMethodError when using flink flink-sql-connector-hive-2.2.0_2.11-1.14.4.jar

2022-09-01 Thread Liting Liu (litiliu)
(litiliu) Subject: Re:Re: get NoSuchMethodError when using flink flink-sql-connector-hive-2.2.0_2.11-1.14.4.jar Hi, Liu. It seems that you may be using your own jars that include commons-lang3 in other versions, which may cause the version conflict. My suggestion is that you can shade this

Re:Re: get NoSuchMethodError when using flink flink-sql-connector-hive-2.2.0_2.11-1.14.4.jar

2022-08-31 Thread Xuyang
ig/#classloader-parent-first-patterns-default Best regards, Yuxia From: "Liting Liu (litiliu)" To: "User" Date: Wednesday, August 31, 2022, 5:14:35 PM Subject: get NoSuchMethodError when using flink flink-sql-connector-hive-2.2.0_2.11-1.14.4.jar Hi, I got NoSuchMethodError when using

Re: get NoSuchMethodError when using flink flink-sql-connector-hive-2.2.0_2.11-1.14.4.jar

2022-08-31 Thread yuxia
"Liting Liu (litiliu)" To: "User" Date: Wednesday, August 31, 2022, 5:14:35 PM Subject: get NoSuchMethodError when using flink flink-sql-connector-hive-2.2.

get NoSuchMethodError when using flink flink-sql-connector-hive-2.2.0_2.11-1.14.4.jar

2022-08-31 Thread Liting Liu (litiliu)
Hi, I got NoSuchMethodError when using flink flink-sql-connector-hive-2.2.0_2.11-1.14.4.jar. Exception in thread "main" org.apache.flink.table.client.SqlClientException: Unexpected exception. This is a bug. Please consider filing an issue.

Re: Unable to start job using Flink Operator

2022-07-28 Thread Márton Balassi
Hi Morgan Karl, Could you please post the logs of your operator pod? A possible explanation is that your operator might be stuck in an unhealthy state, hence the webhook can not be accessed. Best, Marton On Thu, Jul 28, 2022 at 5:37 PM Geldenhuys, Morgan Karl < morgan.geldenh...@tu-berlin.de> wr

Unable to start job using Flink Operator

2022-07-28 Thread Geldenhuys, Morgan Karl
Greetings all, I am attempting to start a flink job using the Flink operator (version 1.1); however, I am running into a problem. While attempting to create the deployment I receive the following error: Resource: "flink.apache.org/v1beta1, Resource=flinkdeployments", GroupVersionKind: "flink.apa

Re: multiple pipeline deployment using flink k8s operator

2022-06-01 Thread Yang Wang
The current application mode has the limitation that only one job can be submitted when HA is enabled [1]. So a feasible solution is to use the session mode [2]; it will be supported in the coming release-1.0.0. However, I am afraid it still could not satisfy your requirement "2 task managers (one pe

multiple pipeline deployment using flink k8s operator

2022-05-31 Thread Sigalit Eliazov
Hi all, we just started using the flink k8s operator to deploy our flink cluster. From what we understand we are only able to start a flink cluster per job. So in our case, when we have 2 jobs we have to create 2 different clusters. Obviously we would prefer to deploy these 2 jobs, which relate to th

Re: [External] Require help regarding possible issue/bug I'm facing while using Flink

2022-03-07 Thread Qingsheng Ren
Hi De Xun, Unfortunately MAP, ARRAY and ROW types are supported by the Flink Parquet format only since Flink 1.15 (see FLINK-17782 [1], not released yet). You may want to upgrade to Flink 1.15 once it is released, or make your own implementation based on the latest code on the master branch fo

Re: Require help regarding possible issue/bug I'm facing while using Flink

2022-03-06 Thread Qingsheng Ren
Hi De Xun, I created an answer on StackOverflow and hope it will be helpful. I'd like to repost my answer here for the convenience of people on the mailing lists. The first call of RowRowConverter::toInternal is an internal implementation for making a deep copy of the StreamRecord emitted by tab

Require help regarding possible issue/bug I'm facing while using Flink

2022-03-06 Thread Chia De Xun .
Greetings, I'm facing a difficult issue/bug while working with Flink. Would definitely appreciate some official expert help on this issue. I have posted my problem on StackOverflow, but have no

Re: Skewed Data when joining tables using Flink SQL

2022-01-07 Thread Anne Lai
> Best, > Yun > > > [1] https://issues.apache.org/jira/browse/FLINK-20491 > > > --Original Mail -- > *Sender:* Anne Lai > *Send Date:* Fri Jan 7 17:20:58 2022 > *Recipients:* User > *Subject:* Skewed Data when joining tables using Flink SQL > >

Re: Skewed Data when joining tables using Flink SQL

2022-01-07 Thread Yun Gao
://issues.apache.org/jira/browse/FLINK-20491 --Original Mail -- Sender:Anne Lai Send Date:Fri Jan 7 17:20:58 2022 Recipients:User Subject:Skewed Data when joining tables using Flink SQL Hi, I have a Flink batch job that needs to join a large skewed table with a smaller

Skewed Data when joining tables using Flink SQL

2022-01-07 Thread Anne Lai
Hi, I have a Flink batch job that needs to join a large skewed table with a smaller table, and because records are not evenly distributed to each subtask, it always fails with a "too much data in partition" error. I explored using DataStream API to broadcast the smaller tables as a broadcast state

Re: using flink retract stream and rocksdb, too many intermediate result of values cause checkpoint too heavy to finish

2021-12-16 Thread vtygoss
lies.apache.org/flink/flink-docs-master/docs/dev/table/tuning/#local-global-aggregation vtygoss wrote on Mon, Dec 13, 2021 at 17:13: Hi, community! I met a problem in the process of building a streaming production pipeline using Flink retract stream and hudi-hdfs/kafka as storage engine and rocksdb as

Re: using flink retract stream and rockdb, too many intermediate result of values cause checkpoint too heavy to finish

2021-12-15 Thread Arvid Heise
optimizations on it. If you enable mini-batch and > two-stage aggregations, most jobs will meet their performance needs. > > [1] > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/tuning/#local-global-aggregation > > vtygoss wrote on Mon, Dec 13, 2021 at 17:13: > >> Hi,

Re: using flink retract stream and rockdb, too many intermediate result of values cause checkpoint too heavy to finish

2021-12-13 Thread Caizhi Weng
aming production > pipeline using Flink retract stream and hudi-hdfs/kafka as storage engine > and rocksdb as statebackend. > > > In my scenario, > > - During a patient's hospitalization, multiple measurements of vital signs > are recorded, including temperature, pulse,

using flink retract stream and rocksdb, too many intermediate result of values cause checkpoint too heavy to finish

2021-12-13 Thread vtygoss
Hi, community! I met a problem in the process of building a streaming production pipeline using Flink retract stream and hudi-hdfs/kafka as storage engine and rocksdb as statebackend. In my scenario, - During a patient's hospitalization, multiple measurements of vital sign

Re: How to mount PVC volumes using Flink Native Kubernetes ?

2021-09-17 Thread Yang Wang
Thanks Tao for providing your internal use case. I have created a ticket for this feature [1]. [1]. https://issues.apache.org/jira/browse/FLINK-24332 Best, Yang tao xiao wrote on Sat, Sep 11, 2021 at 10:18 AM: > Thanks David for the tips. We have been running Flink with no performance > degradation observed i

Re: How to mount PVC volumes using Flink Native Kubernetes ?

2021-09-10 Thread tao xiao
Thanks David for the tips. We have been running Flink with no performance degradation observed in EMR (which is EBS attached) for more than 1 year, therefore we believe the same performance can be expected in Kubernetes. On Sat, Sep 11, 2021 at 3:13 AM David Morávek wrote: > OT: Beware that even i

Re: How to mount PVC volumes using Flink Native Kubernetes ?

2021-09-10 Thread David Morávek
OT: Beware that even if you manage to solve this, EBS is replicated network storage, therefore rocksdb performance will be affected significantly. Best, D. On Fri 10. 9. 2021 at 16:19, tao xiao wrote: > The use case we have is to store the RocksDB sst files in EBS. The EC2 > instance type (m5)

Re: How to mount PVC volumes using Flink Native Kubernetes ?

2021-09-10 Thread tao xiao
The use case we have is to store the RocksDB sst files in EBS. The EC2 instance type (m5) we use doesn't provide local disk storage, therefore EBS is the only option to store the local sst files. On Fri, Sep 10, 2021 at 7:10 PM Yang Wang wrote: > I am afraid Flink does not support creating a dedica

Re: How to mount PVC volumes using Flink Native Kubernetes ?

2021-09-10 Thread Yang Wang
I am afraid Flink does not support creating a dedicated PVC for each TaskManager pod right now. But I think it might be a reasonable requirement. Could you please share why you need to mount a persistent volume claim per TaskManager? AFAIK, the TaskManager will be deleted once it fails. You expect the PV

How to mount PVC volumes using Flink Native Kubernetes ?

2021-09-09 Thread Xiaolong Wang
Hi, I'm facing a tough question. I want to start a Flink Native Kubernetes job with each of the task manager pods mounted with an aws-ebs PVC. The first thought was to use the pod-template file to do this, but it soon hit a dead end. Since the pod-template on each of the task manager pods i

Re: how to read data from hive table using flink sqlclent?

2021-08-24 Thread xm lian
I have added a new project to quickly reproduce this error: https://github.com/lianxmfor/flink-hive xm lian wrote on Tue, Aug 24, 2021 at 7:56 PM: > I tried to read the data from the hive table using the flink sql client as per > the flink documentation >

how to read data from hive table using flink sqlclent?

2021-08-24 Thread xm lian
I tried to read the data from the hive table using the flink sql client as per the flink documentation but it failed. I can read the table meta inform

Upcoming Meetup talk on using Flink & Pinot - Pinot vs Elasticsearch, a Tale of Two PoCs

2021-06-18 Thread Ken Krugler
Hi all, Next Tuesday I'll be showing how we use Flink + Pinot to provide a new and improved analytics engine for Adbeat <https://adbeat.com/>. https://www.meetup.com/apache-pinot/events/277817649/ — Ken PS - Some caveats - it's mostly about Pinot and Elasticsearch, and we're using

Re: Cannot start from savepoint using Flink 1.12 in standalone Kubernetes + Kubernetes HA

2020-12-29 Thread Yang Wang
This is a known issue. Please refer here [1] for more information. It is already fixed on master and the 1.12 branch, and the next minor Flink release (1.12.1) will include it. Maybe you could help to verify that. [1]. https://issues.apache.org/jira/browse/FLINK-20648 Best, Yang ChangZhu

Cannot start from savepoint using Flink 1.12 in standalone Kubernetes + Kubernetes HA

2020-12-29 Thread 陳昌倬
Hi, We cannot start a job from a savepoint (created by Flink 1.12, Standalone Kubernetes + zookeeper HA) in Flink 1.12, Standalone Kubernetes + Kubernetes HA. The following is the exception that stops the job. Caused by: java.util.concurrent.CompletionException: org.apache.flink.kubernetes.kubec

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-12 Thread Felipe Gutierrez
I see. Now it has different query plans. It was documented on another page, so I got confused. Thanks! -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Thu, Nov 12, 2020 at 12:41 PM Jark Wu wrote: > > Hi Felipe, > > The default value of `table.optimiz

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-12 Thread Jark Wu
Hi Felipe, The default value of `table.optimizer.agg-phase-strategy` is AUTO: if mini-batch is enabled, it will use TWO-PHASE, otherwise ONE-PHASE. https://ci.apache.org/projects/flink/flink-docs-master/dev/table/config.html#table-optimizer-agg-phase-strategy On Thu, 12 Nov 2020 at 17:52, Felipe
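
The options Jark mentions can be set programmatically; a sketch against a recent Table API (the latency and batch-size values are illustrative):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class AggPhaseConfigSketch {
        public static void main(String[] args) {
            TableEnvironment tableEnv =
                    TableEnvironment.create(EnvironmentSettings.inStreamingMode());

            // Mini-batch must be enabled for AUTO to choose TWO-PHASE.
            tableEnv.getConfig().getConfiguration()
                    .setString("table.exec.mini-batch.enabled", "true");
            tableEnv.getConfig().getConfiguration()
                    .setString("table.exec.mini-batch.allow-latency", "5 s");
            tableEnv.getConfig().getConfiguration()
                    .setString("table.exec.mini-batch.size", "5000");
            // Or force the strategy explicitly instead of relying on AUTO.
            tableEnv.getConfig().getConfiguration()
                    .setString("table.optimizer.agg-phase-strategy", "TWO_PHASE");
        }
    }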

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-12 Thread Felipe Gutierrez
Hi Jark, I don't get the difference between the "MiniBatch Aggregation" and the "Local-Global Aggregation". The web page [1] says that I have to enable the TWO_PHASE parameter. So I compared the query plans from both, with and without the TWO_PHASE parameter, and they are the same.

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-10 Thread Felipe Gutierrez
I see, thanks Timo -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Tue, Nov 10, 2020 at 3:22 PM Timo Walther wrote: > > Hi Felipe, > > with non-deterministic Jark meant that you never know if the mini batch > timer (every 3 s) or the mini batch thr

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-10 Thread Timo Walther
Hi Felipe, by non-deterministic Jark meant that you never know whether the mini batch timer (every 3 s) or the mini batch threshold (e.g. 3 rows) fires the execution. This depends on how fast records arrive at the operator. In general, processing time can be considered non-deterministic, because 1

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-10 Thread Felipe Gutierrez
Hi Jark, thanks for your reply. Indeed, I forgot to write DISTINCT in the query and now the query plan is splitting into two hash partition phases. What do you mean by deterministic time? Why is only the window aggregate deterministic? If I implement the ProcessingTimeCallback [1] interface is it

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-09 Thread Jark Wu
Hi Felipe, The "Split Distinct Aggregation", i.e. the "table.optimizer.distinct-agg.split.enabled" option, only works for distinct aggregations (e.g. COUNT(DISTINCT ...)). However, the query in your example is using COUNT(driverId). You can update it to COUNT(DISTINCT driverId), and it should sh

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-09 Thread Felipe Gutierrez
I realized that I forgot the image. Now it is attached. -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Mon, Nov 9, 2020 at 1:41 PM Felipe Gutierrez wrote: > > Hi community, > > I am testing the "Split Distinct Aggregation" [1] consuming the taxi >

Stream aggregation using Flink Table API (Blink plan)

2020-11-09 Thread Felipe Gutierrez
Hi community, I am testing the "Split Distinct Aggregation" [1] consuming the taxi ride data set. My sql query from the table environment is the one below: Table tableCountDistinct = tableEnv.sqlQuery("SELECT startDate, COUNT(driverId) FROM TaxiRide GROUP BY startDate"); and I am enabling: conf

Re: How to ensure that job is restored from savepoint when using Flink SQL

2020-07-08 Thread shadowell
that is needed to compute the result of a query. If you update a query, it is very well possible that the result computed by Flink won't be equal to the actual result of the new query. Best, Fabian On Mon, Jul 6, 2020 at 10:50, shadowell wrote: Hello, everyone, I hav

Re: How to ensure that job is restored from savepoint when using Flink SQL

2020-07-08 Thread Fabian Hueske
rse a significant limitation that the community is aware of > and planning to improve in the future. > > I'd also like to add that it can be very difficult to assess whether it is > meaningful to start a query from a savepoint that was generated with a > different query. >

Re: How to ensure that job is restored from savepoint when using Flink SQL

2020-07-07 Thread shadowell
shadowell wrote: Hello, everyone, I have some unclear points when using Flink SQL. I hope to get an answer, or to be told where I can find the answer. When using the DataStream API, in order to ensure that the job can recover the state from savepoint after adjustment, it i

Re: How to ensure that job is restored from savepoint when using Flink SQL

2020-07-07 Thread Fabian Hueske
't be equal to the actual result of the new query. Best, Fabian On Mon, Jul 6, 2020 at 10:50, shadowell wrote: > > Hello, everyone, > I have some unclear points when using Flink SQL. I hope to get an > answer, or to be told where I can find the answer. >

How to ensure that job is restored from savepoint when using Flink SQL

2020-07-06 Thread shadowell
Hello, everyone, I have some unclear points when using Flink SQL. I hope to get an answer, or to be told where I can find the answer. When using the DataStream API, in order to ensure that the job can recover the state from savepoint after adjustment, it is necessary to specify

Re: Any python example with json data from Kafka using flink-statefun

2020-06-16 Thread Sunil
Thanks Gordon. Really appreciate your detailed response and this definitely helps. On 2020/06/17 04:45:11, "Tzu-Li (Gordon) Tai" wrote: > (forwarding this to user@ as it is more suited to be located there) > > Hi Sunil, > > With remote functions (using the Python SDK), messages sent to / from

Re: Any python example with json data from Kafka using flink-statefun

2020-06-16 Thread Tzu-Li (Gordon) Tai
(forwarding this to user@ as it is more suited to be located there) Hi Sunil, With remote functions (using the Python SDK), messages sent to / from them must be Protobuf messages. This is a requirement since remote functions can be written in any language, and we use Protobuf as a means for cross

Re: Developing Beam applications using Flink checkpoints

2020-05-20 Thread Eleanore Jin
link, but it seems it is not so popular. > > On Tue, 2020-05-19 at 12:46 +0200, Arvid Heise wrote: > > Hi Ivan, > > > > I fear that only a few mailing list users actually have deeper > > Beam experience. I only used it briefly 3 years ago. Most users here >

Re: Developing Beam applications using Flink checkpoints

2020-05-19 Thread Ivan San Jose
only used it briefly 3 years ago. Most users here > are using Flink directly to avoid these kinds of double-abstraction > issues. > > It might be better to switch to the Beam mailing list if you have > Beam-specific questions including how the Flink runner actually > translates th

Re: Developing Beam applications using Flink checkpoints

2020-05-19 Thread Arvid Heise
Hi Ivan, I fear that only a few mailing list users actually have deeper Beam experience. I only used it briefly 3 years ago. Most users here are using Flink directly to avoid these kinds of double-abstraction issues. It might be better to switch to the Beam mailing list if you have

Re: Developing Beam applications using Flink checkpoints

2020-05-19 Thread Ivan San Jose
Actually I'm also thinking about how Beam coders relate to the runner's serialization... I mean, in Beam you specify a coder per Java type so that Beam is able to serialize/deserialize that type, but then, is the runner used under the hood serializing/deserializing the result again, so

Re: Developing Beam applications using Flink checkpoints

2020-05-19 Thread Ivan San Jose
Yep, sorry if I'm bothering you, but I think I'm still not getting this: how could I tell Beam to tell Flink to use that serializer instead of the standard Java one? Because I think Beam is abstracting us from the Flink checkpointing mechanism, I'm afraid that if we use the Flink API directly we might break

Re: Developing Beam applications using Flink checkpoints

2020-05-19 Thread Arvid Heise
Hi Ivan, The easiest way is to use some implementation that's already there [1]. I already mentioned Avro and would strongly recommend giving it a go. If you make sure to provide a default value for as many fields as possible, you can always remove them later giving you great flexibility. I can gi

Re: Developing Beam applications using Flink checkpoints

2020-05-19 Thread Ivan San Jose
Thanks for your complete answer Arvid, we will try to approach all the things you mentioned, but take into account we are using Beam on top of Flink, so, to be honest, I don't know how we could implement the custom serialization thing ( https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/s

Re: Developing Beam applications using Flink checkpoints

2020-05-18 Thread Arvid Heise
Hi Ivan, First let's address the issue with idle partitions. The solution is to use a watermark assigner that also emits a watermark with some idle timeout [1]. Now the second question, on why Kafka offsets are committed for in-flight, checkpointed data. The basic idea is that you are not losing
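
The idle-timeout watermark assigner referenced as [1] looks roughly like this in the WatermarkStrategy API that the DataStream API has carried since around Flink 1.11 (event type, out-of-orderness bound, and timeout are placeholders):

    import java.time.Duration;
    import org.apache.flink.api.common.eventtime.WatermarkStrategy;

    public class IdleWatermarkSketch {
        // Hypothetical event type with an epoch-millis timestamp field.
        public static class Event {
            public long timestampMillis;
            public String payload;
        }

        public static WatermarkStrategy<Event> strategy() {
            return WatermarkStrategy
                    .<Event>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                    .withTimestampAssigner((event, recordTs) -> event.timestampMillis)
                    // A partition that produces no data for 1 minute is marked
                    // idle so it no longer holds back the overall watermark.
                    .withIdleness(Duration.ofMinutes(1));
        }
    }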

Re: Using flink-connector-kafka-1.9.1 with flink-core-1.7.2

2020-05-18 Thread Arvid Heise
de if you cannot upgrade the Flink cluster. That works, for example, when working with Yarn. On Thu, May 14, 2020 at 3:22 PM Nick Bendtner wrote: > Hi Arvid, > I had no problems using flink Kafka connector 1.8.0 with flink 1.7.2 core > . > > Best > Nick > > On Thu, May 7

Developing Beam applications using Flink checkpoints

2020-05-15 Thread Ivan San Jose
Hi, we are starting to use Beam with Flink as the runner in our applications, and recently we would like to get the advantages that Flink checkpointing provides, but it seems we are not understanding it clearly. Simplifying, our application does the following: - Read messages from a couple of Kafka topic

Re: Using flink-connector-kafka-1.9.1 with flink-core-1.7.2

2020-05-06 Thread Arvid Heise
this will ultimately work as you are pretty much backporting some changes to the old version. [1] https://github.com/AHeise/flink On Thu, May 7, 2020 at 2:35 AM Nick Bendtner wrote: > Hi guys, > I am using flink version 1.7.2. I have to deserialize data from kafka > into consume

Using flink-connector-kafka-1.9.1 with flink-core-1.7.2

2020-05-06 Thread Nick Bendtner
Hi guys, I am using flink version 1.7.2. I have to deserialize data from kafka into consumer records, therefore I decided to update the flink-connector-kafka to 1.9.1, which provides support for consumer records. We use child-first class loading. However it seems like I have a compatibility issue as I

Re: [Stateful Functions] Using Flink CEP

2020-04-14 Thread Oytun Tez
41 AM Igal Shilman wrote: > Hi, > > I'm not familiar with the other library that you have mentioned, and > indeed using Flink CEP from within a stateful function is not possible > within a single Flink job, as Gordon mentioned. > > I'm wondering what aspects of CEP
