Re: Question about in-place rescaling support in Flink k8s operator

2025-02-15 Thread gyula . fora
Hi! In-place scaling is only supported and works correctly with the native mode, which is the default and recommended mode when using the k8s operator. I personally don’t have plans to work on the standalone mode or to add this support. It’s a bit tricky as you said. I

Re: Question around manually setting Flink jobId

2024-03-18 Thread Venkatakrishnan Sowrirajan
Thanks for the response, Asimansu. I should have been a bit clearer and shared some additional context on our internal deployment. Currently, we are running *Flink in YARN application mode* for *batch* execution purposes (we also run it for stream execution). In the YARN application m

Re: Question around manually setting Flink jobId

2024-03-14 Thread Asimansu Bera
Hello Venkat, There are a few ways to get the JobID from the client side. The JobID is alphanumeric, e.g. 9eec4d17246b5ff965a43082818a3336. When you submit the job using the flink command-line client, the JobID is returned as: Job has been submitted with JobID 9eec4d17246b5ff965a43082818a3336 1. using below comma
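A minimal sketch of scraping the JobID out of that client message programmatically; the regex and class name are mine, not from the thread — only the quoted submission line comes from the message above:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JobIdExtractor {
    // Matches the 32-hex-character JobID in the client's submission message.
    private static final Pattern JOB_ID =
            Pattern.compile("Job has been submitted with JobID ([0-9a-f]{32})");

    // Returns the JobID if the line carries one, null otherwise.
    public static String extract(String clientOutput) {
        Matcher m = JOB_ID.matcher(clientOutput);
        return m.find() ? m.group(1) : null;
    }
}
```

Piping `flink run` output through a helper like this is one way to capture the JobID for later HistoryServer lookups.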

Re: Question around manually setting Flink jobId

2024-03-14 Thread Venkatakrishnan Sowrirajan
Junrui, Thanks for your answer to the above questions. Allison and I work together on Flink. One of the main questions is: is there an easy way to get the Flink "JobID" from the Flink client side? Without the "JobID", users have no way to access the Flink HistoryServer other than searching through t

Re: Question around manually setting Flink jobId

2024-03-13 Thread Junrui Lee
Hi Allison, The PIPELINE_FIXED_JOB_ID configuration option is not intended for public use. IIUC, the only way to manually specify the jobId is by submitting a job through the JAR RUN REST API, where you can provide the jobId in the request body ( https://nightlies.apache.org/flink/flink-docs-master/d
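A sketch of building that JAR RUN request; the host, port, jar id, and the exact body fields are assumptions on my part (the `jobId` field follows the REST docs linked above) — verify against the REST API reference for your Flink version:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class JarRun {
    // Builds (but does not send) a POST to /v1/jars/:jarid/run with a pinned
    // jobId in the JSON body. Host/port/jar id here are placeholders.
    static HttpRequest request(String host, String jarId, String jobId) {
        String body = "{\"jobId\": \"" + jobId + "\"}";
        return HttpRequest.newBuilder()
                .uri(URI.create("http://" + host + ":8081/v1/jars/" + jarId + "/run"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

Sending it with `java.net.http.HttpClient` returns the submitted job's id in the response, which should match the one pinned in the body.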

Re: Question about time-based operators with RocksDB backend

2024-03-06 Thread xia rui
Hi Gabriele, using (or extending) the window operator provided by Flink is a better idea. A window operator in Flink manages two types of state: - Window state: accumulates data for windows, and provides data to the window function when a window comes to its end time. - Timer state: store the end tim

Re: Question about time-based operators with RocksDB backend

2024-03-05 Thread Jinzhong Li
Hi Gabriele, The keyed state APIs (ValueState, ListState, etc.) are supported by all types of state backend (hashmap, rocksdb, etc.). And the built-in window operators are implemented with these state APIs internally. So you can use these built-in operators/functions with the RocksDB state backend

Re: Question about time-based operators with RocksDB backend

2024-03-04 Thread Zakelly Lan
Hi Gabriele, Quick answer: You can use the built-in window operators which have been integrated with state backends including RocksDB. Thanks, Zakelly On Tue, Mar 5, 2024 at 10:33 AM Zhanghao Chen wrote: > Hi Gabriele, > > I'd recommend extending the existing window function whenever possible
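Since the built-in windows pick the backend up from configuration, switching to RocksDB can be as small as a flink-conf.yaml fragment like the following (the checkpoint path is a placeholder; these are the 1.x-era keys used in the threads above):

```yaml
# Use RocksDB for keyed state; built-in window operators use it automatically.
state.backend: rocksdb
# Placeholder path; point this at a durable filesystem in practice.
state.checkpoints.dir: file:///tmp/flink-checkpoints
# Optional: incremental checkpoints, which RocksDB supports.
state.backend.incremental: true
```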

Re: Question about time-based operators with RocksDB backend

2024-03-04 Thread Zhanghao Chen
Hi Gabriele, I'd recommend extending the existing window function whenever possible, as Flink will automatically cover state management for you and no need to be concerned with state backend details. Incremental aggregation for reduce state size is also out of the box if your usage can be satis

Re: [Question] How to scale application based on 'reactive' mode

2023-10-23 Thread Dennis Jung
Also, the metrics is on a per-task granularity and allows us to identify >>> bottleneck tasks. >>> 3. Autoscaler feature currently only works for K8s operator + native >>> K8s mode. >>> Best, >>> Zhanghao Chen

Re: [Question] How to scale application based on 'reactive' mode

2023-09-07 Thread Gyula Fóra
Also, the metrics is on a per-task granularity and allows us to identify >> bottleneck tasks. >> 3. Autoscaler feature currently only works for K8s operator + native >> K8s mode. >> Best, >> Zhanghao Chen

Re: [Question] How to scale application based on 'reactive' mode

2023-09-07 Thread Dennis Jung
k tasks. > 3. Autoscaler feature currently only works for K8s operator + native > K8s mode. > Best, > Zhanghao Chen > -- > *From:* Dennis Jung > *Sent:* September 2, 2023 12:58 > *To:* Gyula Fóra > *Cc:* user@flink.apache.org > *Subject:*

Re: Question regarding asyncIO timeout

2023-09-06 Thread liu ron
Hi, Leon > Besides that, do you know if the async timeout is actually a global timeout? meaning it accounts for the time of each attempt call plus any interval time in between. Yes, the timeout is a total timeout; you can see [1][2] for more detail. [1] https://cwiki.apache.org/confluence/pages/v

Re: Question regarding asyncIO timeout

2023-09-05 Thread Leon Xu
Hi Ken, Thanks for the suggestion. Definitely a good call to just wrap the retry inside the client code. I'll give it a try. Besides that, do you know if the async timeout is actually a global timeout? Meaning, does it account for the time of each attempt call plus any interval time in between? I incre

Re: Question regarding asyncIO timeout

2023-09-05 Thread Ken Krugler
Hi Leon, Normally I try to handle retrying in the client being used to call the server, as you have more control/context. If that’s not an option for you, then normally (un)orderedWaitWithRetry() should work - when you say “it doesn’t seem to help much”, are you saying that even with retry you
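Ken's "handle retrying in the client" suggestion can be sketched as a plain-Java retry wrapper around an async call; the class and method names are mine, and this deliberately avoids Flink's API so it stays a standalone illustration:

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

public class ClientRetry {
    // Retries the supplied async call up to maxAttempts times inside the
    // client, instead of relying on the async operator's retry/timeout.
    public static <T> CompletableFuture<T> withRetry(
            Supplier<CompletableFuture<T>> call, int maxAttempts) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(call, maxAttempts, result);
        return result;
    }

    private static <T> void attempt(
            Supplier<CompletableFuture<T>> call, int attemptsLeft,
            CompletableFuture<T> result) {
        call.get().whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
            } else if (attemptsLeft > 1) {
                attempt(call, attemptsLeft - 1, result); // retry on failure
            } else {
                result.completeExceptionally(error);     // give up
            }
        });
    }
}
```

Note that all attempts still have to fit inside the async operator's single global timeout, which is the point Leon raises above.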

Re: [Question] How to scale application based on 'reactive' mode

2023-09-01 Thread Dennis Jung
Hello, Thanks for your notice. 1. In "Flink 1.18 + non-reactive", is parallelism being changed by the number of TM? 2. In the document( https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.6/docs/custom-resource/autoscaler/), it said "we are not using any container memory /

Re: [Question] How to scale application based on 'reactive' mode

2023-09-01 Thread Gyula Fóra
Pretty much, except that with Flink 1.18 the autoscaler can scale the job in place without restarting the JM (even without reactive mode). So actually the best option is the autoscaler with Flink 1.18 native mode (no reactive). Gyula On Fri, 1 Sep 2023 at 13:54, Dennis Jung wrote: > Thanks for feedback. >

Re: [Question] How to scale application based on 'reactive' mode

2023-09-01 Thread Dennis Jung
Thanks for feedback. Could you check whether I understand correctly? *Only using 'reactive' mode:* By manually adding TaskManager(TM) (such as using './bin/taskmanager.sh start'), parallelism will be increased. For example, when job parallelism is 1 and TM is 1, and if adding 1 new TM, JobManager

Re: [Question] How to scale application based on 'reactive' mode

2023-09-01 Thread Gyula Fóra
I would look at reactive scaling as a way to increase / decrease parallelism. It’s not a way to automatically decide when to actually do it, as you need to create new TMs. The autoscaler could use reactive mode to change the parallelism, but you need the autoscaler itself to decide when new resour

Re: [Question] How to scale application based on 'reactive' mode

2023-09-01 Thread Dennis Jung
For now, the thing I've found about 'reactive' mode is that it automatically adjusts 'job parallelism' when TaskManager is increased/decreased. https://www.slideshare.net/FlinkForward/autoscaling-flink-with-reactive-mode Is there some other feature that only 'reactive' mode offers for scaling? T

Re: [Question] How to scale application based on 'reactive' mode

2023-09-01 Thread Dennis Jung
Hello, Thank you for your response. I have a few more questions on the following: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/elastic_scaling/ *Reactive Mode configures a job so that it always uses all resources available in the cluster. Adding a TaskManager will scale up

Re: [Question] How to scale application based on 'reactive' mode

2023-08-31 Thread Gyula Fóra
The reactive mode reacts to available resources. The autoscaler reacts to changing load and processing capacity and adjusts resources. Completely different concepts and applicability. Most people want the autoscaler , but this is a recent feature and is specific to the k8s operator at the moment.

Re: [Question] How to scale application based on 'reactive' mode

2023-08-31 Thread Dennis Jung
Hello, Thanks for your notice. Then what is the purpose of using 'reactive', if this doesn't do anything itself? What is the difference if I use the auto-scaler without 'reactive' mode? Regards, Jung On Fri, Aug 18, 2023 at 7:51 PM, Gyula Fóra wrote: > Hi! > > I think what you need is probably not the r

Re: [Question] How to scale application based on 'reactive' mode

2023-08-18 Thread Gyula Fóra
Hi! I think what you need is probably not the reactive mode but a proper autoscaler. The reactive mode as you say doesn't do anything in itself, you need to build a lot of logic around it. Check this instead: https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resou

Re: [Question] Good way to monitor data skewness

2023-08-17 Thread Dennis Jung
Hello, Thanks for feedback. I'll try to add the setup in the internal Grafana. BR, JUNG On Wed, Aug 16, 2023 at 6:29 PM, Hang Ruan wrote: > Hi, Dennis. > > As Ron said, we could judge this situation by the metrics. > We are usually reporting the metrics to the external system like > Prometheus by the metr

Re: [Question] Good way to monitor data skewness

2023-08-16 Thread Hang Ruan
Hi, Dennis. As Ron said, we could judge this situation by the metrics. We usually report the metrics to an external system like Prometheus using the metric reporter[1]. And these metrics can be shown by some other tools like Grafana[2]. Best, Hang [1] https://nightlies.apache.org/flink/fl

Re: [Question] Good way to monitor data skewness

2023-08-15 Thread liu ron
Hi, Dennis, Although all operators are chained together, each operator's metrics are still there; you can view the metrics related to the corresponding operator's input and output records through the UI, as follows: [image: image.png] Best, Ron Dennis Jung wrote on Wed, Aug 16, 2023 at 14:13: > Hello people, > I'

Re: Question about serialization of java.util classes

2023-08-15 Thread Alexis Sarda-Espinosa
Check this answer: https://stackoverflow.com/a/64721838/5793905 You could then use, for example, something like: new SetTypeInfo(Types.STRING) instead of Types.LIST(Types.STRING) On Tue, 15 Aug 2023 at 10:40, wrote: > Hello Alexis, > > Thank you for sharing the helper classes this but un

Re: Question about serialization of java.util classes

2023-08-15 Thread s
Hello Alexis, Thank you for sharing the helper classes, but unfortunately I have no idea how to use these classes or how they might be able to help me. This is all very new to me and I honestly can't wrap my head around Flink's type information system. Best regards, Saleh. > On 14 Aug 202

Re: Question about serialization of java.util classes

2023-08-14 Thread Alexis Sarda-Espinosa
Hello, AFAIK you cannot avoid TypeInformationFactory due to type erasure, nothing Flink can do about that. Here's an example of helper classes I've been using to support set serde in Flink POJOs, but note that it's hardcoded for LinkedHashSet, so you would have to create different implementations

Re: Question about serialization of java.util classes

2023-08-14 Thread s
Hi, Here's a minimal example using an ArrayList, a HashSet, and a TreeSet: ``` package com.example; import java.util.ArrayList; import java.util.HashSet; import java.util.TreeSet; import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; public class App { public static cla

Re: Question about serialization of java.util classes

2023-08-14 Thread Alexey Novakov via user
Hi Saleh, If you could show us the minimal code example of the issue (event classes), I think someone could help you to solve it. Best regards, Alexey On Mon, Aug 14, 2023 at 9:23 AM wrote: > Hi, > > According to this blog post > https://flink.apache.org/2020/04/15/flink-serialization-tuning-v

Re: Question about serialization of java.util classes

2023-08-14 Thread s
Hi, According to this blog post https://flink.apache.org/2020/04/15/flink-serialization-tuning-vol.-1-choosing-your-serializer-if-you-can/#pojoserializer The "Must be proc

Re: Question about serialization of java.util classes

2023-08-13 Thread liu ron
Hi, According to the test in [1], I think Flink can recognize Pojo class which contains java List, so I think you can refer to the related Pojo class implementation. [1] https://github.com/apache/flink/blob/fc2b5d8f53a41695117f6eaf4c798cc183cf1e36/flink-core/src/test/java/org/apache/flink/api/jav

Re: Question about Flink exception handling

2023-05-23 Thread Sharif Khan via user
Thanks for the clarification. On Tue, May 23, 2023 at 7:07 PM Weihua Hu wrote: > Hi Sharif, > > You could not catch exceptions globally. > > For exceptions that can be explicitly ignored for your business, you need > to add a try-catch in the operators. > For exceptions that are not caught, Fli

Re: Question about Flink exception handling

2023-05-23 Thread Weihua Hu
Hi Sharif, You cannot catch exceptions globally. For exceptions that can be explicitly ignored for your business, you need to add a try-catch in the operators. For exceptions that are not caught, Flink will trigger a recovery from failure automatically[1]. [1] https://nightlies.apache.org/fl
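The "try-catch in the operators" advice can be sketched as a plain-Java wrapper; the class and method names are mine, and a real Flink job would apply the same pattern inside a MapFunction or ProcessFunction (possibly routing failures to a side output):

```java
import java.util.Optional;
import java.util.function.Function;

public class Safely {
    // Wraps a mapping function so that exceptions you explicitly want to
    // ignore are logged for later analysis instead of failing the job.
    public static <T, R> Function<T, Optional<R>> map(Function<T, R> f) {
        return in -> {
            try {
                return Optional.ofNullable(f.apply(in));
            } catch (Exception e) {
                // centralized logging, as Sharif describes wanting above
                System.err.println("Dropped record " + in + ": " + e.getMessage());
                return Optional.empty();
            }
        };
    }
}
```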

Re: Question about Flink exception handling

2023-05-22 Thread Sharif Khan via user
Thanks for your response. For simplicity, I want to capture exceptions in a centralized manner and log them for further analysis, without interrupting the job's execution or causing it to restart. On Tue, May 23, 2023 at 6:31 AM Shammon FY wrote: > Hi Sharif, > > I would like to know what do yo

Re: Question about Flink exception handling

2023-05-22 Thread Shammon FY
Hi Sharif, I would like to know what you want to do with the exception after catching it. There are different ways for different requirements; for example, Flink has already reported these exceptions. Best, Shammon FY On Mon, May 22, 2023 at 4:45 PM Sharif Khan via user wrote: > Hi, commun

Re: Question about Flink metrics

2023-05-04 Thread Mason Chen
Hi Neha, For the jobs you care about, you can attach additional labels using `scope-variables-additional` [1]. The example located in the same page showcases how you can configure KV pairs in its map configuration. Be sure to replace the reporter name with the name of your prometheus reporter! [1

Re: Question about match_recognize clause in Flink

2022-12-22 Thread Martijn Visser
Hi Marjan, That's rather weird, because PyFlink uses the same implementation. Could you file a Jira ticket? If not, let me know and I'll create one for you. Best regards, Martijn On Thu, Dec 22, 2022 at 11:37 AM Marjan Jordanovski wrote: > Hello, > > I am using custom made connector to create

Re: question about Async IO

2022-11-04 Thread David Anderson
Yes, that will work as you expect. So long as you don't put another shuffle or rebalance in between, the keyed partitioning that's already in place will carry through the async i/o operator, and beyond. In most cases you can even use reinterpretAsKeyedStream on the output (so long as you haven't do

Re: Question about UDF randomly processed input row twice

2022-11-03 Thread yuxia
n the StreamPhysicalCalc, as a result of which, it seems the one row will be processed twice. Best regards, Yuxia From: "Xinyi Yan" To: "yuxia" Cc: "User" Sent: Friday, November 4, 2022, 5:28:30 AM Subject: Re: Question about UDF randomly processed input row twice O

Re: Question about UDF randomly processed input row twice

2022-11-03 Thread Xinyi Yan
Ok. The datagen with sequence option can reproduce this issue easily, and it also resulted in an incorrect result. I have a sequence generated by datagen that starts from 1 to 5 and let the UDF randomly either return null or bytes. Surprisingly, not only has the UDF been executed twice, but also the w

Re: Question about UDF randomly processed input row twice

2022-11-03 Thread yuxia
The datagen may produce rows with the same values. From my side, in Flink, the udf shouldn't process one row twice; otherwise, it would be a critical bug. Best regards, Yuxia From: "Xinyi Yan" To: "User" Sent: Thursday, November 3, 2022, 6:59:20 AM Subject: Question about UDF randomly processed

Re: question about Async IO

2022-11-02 Thread Galen Warren
Thanks, that makes sense and matches my understanding of how it works. In my case, I don't actually need access to keyed *state*; I just want to make sure that all elements with the same key are routed to the same instance of the async function. (Without going into too much detail, the reason for

Re: question about Async IO

2022-11-02 Thread Filip Karnicki
Hi Galen I was thinking about the same thing recently and reached a point where I see that async io does not have access to the keyed state because: "* State related apis in [[org.apache.flink.api.common.functions.RuntimeContext]] are not supported * yet because the key may get changed while acc

Re: question about Async IO

2022-10-14 Thread Krzysztof Chmielewski
Hi Galen, i will tell from my experience as a Flink user and developer of Flink jobs. *"if the input to an AsyncFunction is a keyed stream, can I assume that all input elements with the same key will be handled by the same instance of the async operator"* From what I know (and someone can corre

Re: Question about SQL gateway

2022-10-12 Thread Ww J
Thanks Xuyang. Jack > On Oct 12, 2022, at 8:46 AM, Xuyang wrote: > > Hi, currently I think there is no HA for the gateway. When the gateway crashes, > the job being submitted sync will be cancelled, and the async job will > continue running. When the gateway restarts, the async job could

Re: Question regarding to debezium format

2022-09-29 Thread Ali Bahadir Zeybek
Hello Edwin, Would you mind sharing a simple FlinkSQL DDL for the table you are creating with the kafka connector and the debezium-avro-confluent format? Also, can you elaborate on the mechanism that publishes initially to the schema registry and share the corresponding schema? In a nutshell, th

Re: Question regarding to debezium format

2022-09-29 Thread Martijn Visser
Hi Edwin, I'm suspecting that's because those fields are considered metadata which are treated separately. There's https://issues.apache.org/jira/browse/FLINK-20454 for adding the metadata support for the Debezium format with a PR provided, but not yet reviewed. If you could have a look at the PR

Re: Question Regarding State Migrations in Ververica Platform

2022-08-31 Thread Rion Williams
+dev > On Aug 30, 2022, at 11:20 AM, Rion Williams wrote: > >  > Hi all, > > I wasn't sure if this would be the best audience, if not, please advise if > you know of a better place to ask it. I figured that at least some folks here > either work for Ververica or might have used their platfor

Re: Question of Flink Operator Application Cluster Deployment

2022-05-18 Thread Xiao Ma
Hi Őrhidi, Thank you for helping out. I didn't try it on other k8s clusters. Our team is on the whole GKE environment. Is the psp the possible cause? I have given the secret volume in the psp, but not working. Best, *Xiao Ma* *Geotab* Software Developer, Data Engineering | B.Sc, M.Sc Direct

Re: Question of Flink Operator Application Cluster Deployment

2022-05-17 Thread Xiao Ma
w.youtube.com/user/MyGeotab> | LinkedIn >> <https://www.linkedin.com/company/geotab/> >> >> >> -- Forwarded message - >> From: Xiao Ma >> Date: Tue, May 17, 2022 at 4:18 PM >> Subject: Re: Question of Flink Operator Application Cluster Deploymen

Re: Question of Flink Operator Application Cluster Deployment

2022-05-17 Thread John Gerassimou
ssage - > From: Xiao Ma > Date: Tue, May 17, 2022 at 4:18 PM > Subject: Re: Question of Flink Operator Application Cluster Deployment > To: Őrhidi Mátyás > > > Fyi, I didn't manually mount the service account token into the job pod. > It is automatically mount

Re: Question of Flink Operator Application Cluster Deployment

2022-05-16 Thread Őrhidi Mátyás
You don't have to mount the service account explicitly, this should be auto-mounted for you. Please share your (redacted) yamls for the RBAC configs ( https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/rbac/#cluster-scoped-flink-operator-with-jobs-running-in-othe

Re: Question About Histograms

2022-04-05 Thread Anil K
Hello Prasanna, Thanks for your response, Could you elaborate on what you meant by "overriding the Prometheus Histogram class provided "? if possible with any samples? Regards, Anil On Tue, Apr 5, 2022 at 1:11 AM Prasanna kumar wrote: > Anil, > > Flink Histograms are actually summaries .. You n

Re: Question About Histograms

2022-04-04 Thread Prasanna kumar
Anil, Flink Histograms are actually summaries. You need to override the Prometheus Histogram class provided to write it into different buckets to Prometheus. Then you can write prom queries to calculate different quantiles accordingly. Checkpointing the histograms is not a recommended opti

Re: Question about community collaboration options

2022-04-01 Thread Martijn Visser
Hi Ted, This is a great question. There are usually bi-weekly sync meetings to discuss plans and progress for the next Flink release. For example, there was a regular one for the Flink 1.15 release [1] I do see some things that we could improve on as a Flink community. For example, there are quit

Re: Question about Flink counters

2022-03-08 Thread Shane Bishop
Hi, My issue has been resolved through discussion with AWS support. It turns out that Kinesis Data Analytics reports to CloudWatch in a way I did not expect. The way to view the accurate values for Flink counters is with Average in CloudWatch metrics. Below is the response from AWS support, fo

Re: Question about Flink counters

2022-03-07 Thread Shane Bishop
Hi Dawid, My team's Flink application's primary purpose is not to count the number of SQS messages received or the number of successful or failed S3 downloads. The application's primary purpose is to process events and the corresponding data, and for each event, create or update a new entry in

Re: Question about Flink counters

2022-03-07 Thread Dawid Wysakowicz
amazon.com/AmazonCloudWatch/latest/monitoring/working_with_metrics.html *From:* Zhanghao Chen *Sent:* March 5, 2022 11:11 PM *To:* Shane Bishop ; user@flink.apache.org *Subject:* Re: Question about Flink counters Hi Sha

Re: Question about Flink counters

2022-03-06 Thread Shane Bishop
.html From: Zhanghao Chen Sent: March 5, 2022 11:11 PM To: Shane Bishop ; user@flink.apache.org Subject: Re: Question about Flink counters Hi Shane, Could you share more information on what you would like to use the counter for? The counter discussed here is primarily designed for exposing coun

Re: Question about Flink counters

2022-03-05 Thread Zhanghao Chen
e.org Subject: Re: Question about Flink counters If I used a thread-safe counter implementation, would that be enough to make the count correct for a Flink cluster with multiple machines? Best, Shane From: Zhanghao Chen Sent: March 4, 2022 11:08 PM To: Shane Bishop ;

Re: Question about Flink counters

2022-03-05 Thread Shane Bishop
If I used a thread-safe counter implementation, would that be enough to make the count correct for a Flink cluster with multiple machines? Best, Shane From: Zhanghao Chen Sent: March 4, 2022 11:08 PM To: Shane Bishop ; user@flink.apache.org Subject: Re

Re: Question about Flink counters

2022-03-04 Thread Zhanghao Chen
Hi Shane, Flink provides a generic counter interface with a few implementations. The default implementation, SimpleCounter, which is not thread-safe, is used when you call counter(String name) on a MetricGroup. Therefore, you'll need to use your own thread-safe implementation; check out the s
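A minimal sketch of such a thread-safe implementation; the class name is mine, and it mirrors the shape of Flink's Counter interface without depending on it:

```java
import java.util.concurrent.atomic.AtomicLong;

// Thread-safe stand-in for the non-thread-safe SimpleCounter mentioned
// above, backed by an AtomicLong.
public class AtomicCounter {
    private final AtomicLong count = new AtomicLong();

    public void inc() { count.incrementAndGet(); }
    public void inc(long n) { count.addAndGet(n); }
    public void dec() { count.decrementAndGet(); }
    public long getCount() { return count.get(); }
}
```

In a real job this would implement `org.apache.flink.metrics.Counter` and be registered on the MetricGroup in place of `counter(String name)`.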

Re: question on dataSource.collect() on reading states from a savepoint file

2022-02-10 Thread Antonio Si
Thanks Bastien. I will check it out. Antonio. On Thu, Feb 10, 2022 at 11:59 AM bastien dine wrote: > I haven't used s3 with Flink, but according to this doc : > https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/filesystems/s3/ > You can setup pretty easily s3 and use it

Re: question on dataSource.collect() on reading states from a savepoint file

2022-02-10 Thread bastien dine
I haven't used s3 with Flink, but according to this doc : https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/filesystems/s3/ You can setup pretty easily s3 and use it with s3://path/to/your/file with a write sink The page talks about DataStream but it should work with DataSet

Re: question on dataSource.collect() on reading states from a savepoint file

2022-02-10 Thread Antonio Si
Thanks Bastien. Can you point to an example of using a sink as we are planning to write to S3? Thanks again for your help. Antonio. On Thu, Feb 10, 2022 at 11:49 AM bastien dine wrote: > Hello Antonio, > > .collect() method should be use with caution as it's collecting the > DataSet (multiple

Re: question on dataSource.collect() on reading states from a savepoint file

2022-02-10 Thread bastien dine
Hello Antonio, The .collect() method should be used with caution as it collects the DataSet (multiple partitions on multiple TMs) into a single list on the JM (so in memory). Unless you have a lot of RAM, you cannot use it this way, and you probably should not. I recommend you to use a sink to print

Re: Question around creating partitioned tables when running Flink from Ververica

2022-02-04 Thread Ingo Bürk
Hi Natu, yes, in the docs we refer to it as custom catalogs, I apologize for the confusion. Custom catalogs can be added through the UI similar to custom connectors / formats. Best Ingo On 04.02.22 08:45, Natu Lauchande wrote: Hey Ingo, Thanks for the quick response. I will bother you a b

Re: Question around creating partitioned tables when running Flink from Ververica

2022-02-03 Thread Natu Lauchande
Hey Ingo, Thanks for the quick response. I will bother you a bit more : ). We have never used external catalogs; do you perhaps have a link that we can look at? The only reference that I see online is for custom catalogs; is this the same as external catalogs: https://docs.ververica.com/user_guid

Re: Question around creating partitioned tables when running Flink from Ververica

2022-02-03 Thread Ingo Bürk
Hi Natu, the functionality hasn't been actively blocked, it just hasn't yet been implemented in the Ververica Platform Built-In Catalog. Using any external catalog which supports partitioning will work fine. I'll make a note internally for your request on this, though I cannot make any state

Re: Question about MapState size

2022-01-23 Thread Yun Tang
Hi Abdul, What does "only count pertaining to the specific key of partition" mean? Is the counted size for the map related to a specific selected key, or for all the maps in the whole map state? You can leverage RocksDB's native metrics to monitor the RocksDB usage, such as total-sst-files-siz

Re: Question - Filesystem connector for lookup table

2022-01-20 Thread Martijn Visser
Hi Jason, The best option would indeed be to make the dimension data available in something like a database which you can access via JDBC, HBase or Hive. Those do support lookups. Best regards, Martijn On Thu, 20 Jan 2022 at 22:11, Jason Yi <93t...@gmail.com> wrote: > Thanks for the quick resp

Re: Question - Filesystem connector for lookup table

2022-01-20 Thread Jason Yi
Thanks for the quick response. Is there any best or suggested practice for the use case of when we have data sets in a filesystem that we want to use in Flink as reference data (like dimension data)? - Would making dimension data a Hive table or loading it into a table in RDBMS (like MySQL)

Re: Question - Filesystem connector for lookup table

2022-01-20 Thread Martijn Visser
Hi Jason, It's not (properly) supported and we should update the documentation. There is no out of the box possibility to use a file from filesystem as a lookup table as far as I know. Best regards, Martijn Op do 20 jan. 2022 om 18:44 schreef Jason Yi <93t...@gmail.com> > Hello, > > I have da

Re: Question about plain password in flink-conf.yaml

2022-01-18 Thread 狗嗖
Thanks for your reply. We can try this on the CLI, but what about the Web UI? Thanks, Jerry ---Original--- From: "Gabor Somogyi"

Re: Question about plain password in flink-conf.yaml

2022-01-18 Thread Gabor Somogyi
export SSL_PASSWORD=secret flink run -yDsecurity.ssl.rest.*-password=$SSL_PASSWORD ... app.jar This way the code which starts the workload can store the passwords in a centrally protected area. This can still be hacked, but at least the password is not stored in a plain text file. BR, G On Tue, Jan 18, 2022 at 10

Re: question about Statefun/Flink version compatibility

2022-01-10 Thread Igal Shilman
Hello Galen, StateFun is using some internal APIs so they might or might not stay compatible between versions. You can try bumping the version. If it compiles cleanly, most likely this would work. We will be porting the main branch to Flink 1.14 this or next week. Cheers, Igal. On Mon, Jan 10, 2022 a

Re: question on jar compatibility - log4j related

2021-12-19 Thread David Morávek
Hi Eddie, the APIs should be binary compatible across patch releases, so there is no need to re-compile your artifacts. Best, D. On Sun 19. 12. 2021 at 16:42, Colletta, Edward wrote: > If I have jar files built using flink version 11.2 in dependencies, and I > upgrade my cluster to 11.6, is it sa

Re: Question about relationship between operator instances and keys

2021-12-02 Thread David Morávek
Hi haocheng, in short it works as follows: - Each parallel instance of an operator is responsible for one to N key groups. - Each parallel instance belongs to a slot, which is tied with a single thread (slot may actually introduce multiple subtasks) - # of keygroups for each operator = max parall
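The key-group routing David describes can be sketched in plain Java; note this is a simplified illustration with names of my own choosing — Flink's actual KeyGroupRangeAssignment applies a murmur hash to `key.hashCode()` first, where this sketch uses the raw hash for brevity:

```java
public class KeyGroupAssignment {
    // A key is first mapped to one of maxParallelism key groups.
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    // Each parallel operator instance then owns a contiguous range of key
    // groups: 1 to N groups per instance, as described above.
    static int operatorIndexFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }
}
```

Because the group depends only on the key and maxParallelism, the same key always lands on the same operator instance for a fixed parallelism.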

Re: Question on BoundedOutOfOrderness

2021-11-05 Thread Oliver Moser
Thanks Guowei and Alexey, looking at the references you provided helped. I managed to put together simple examples using both the streaming Table API as well as using the CEP library, and I’m now able to process events in event time order. I believe to me my interpretation of setting up watermar

Re: Question on BoundedOutOfOrderness

2021-11-03 Thread Guowei Ma
Hi Oliver I think Alexey is right that you could not assume that the records would be output in event time order. And there is a small addition. I see your output and there are actually multiple concurrencies (probably 11 subtasks). You also can't expect these concurrencies to be ordered accordi

Re: Question on BoundedOutOfOrderness

2021-11-02 Thread Alexey Trenikhun
Hi Oliver, I believe you also need to do a sort; the out-of-orderness watermark strategy only "postpones" the watermark by the given expected maximum out-of-orderness. Check the Ververica example - https://github.com/ververica/flink-training-exercises/blob/master/src/main/java/com/ververica/flinktraining/exam

Re: [External Mail] Re: Question about flink sql

2021-11-01 Thread Caizhi Weng
Hi! Executing a set of statements with the SQL client is supported since Flink 1.13 [1]. Please consider upgrading your Flink version. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sqlclient/#execute-a-set-of-sql-statements 方汉云 wrote on Mon, Nov 1, 2021 at 8:31 PM: > Hi, > >

Re: [External Mail] Re: Question about flink sql

2021-11-01 Thread 方汉云
Hi, I used the official flink-1.12.5 package, configured sql-client-defaults.yaml, and ran bin/sql-client.sh embedded cat conf/sql-client-defaults.yaml catalogs: # A typical catalog definition looks like: - name: myhive type: hive hive-conf-dir: /apps/conf/hive default-database: de

Re: Question about flink sql

2021-11-01 Thread Jingsong Li
Hi, If you are using sql-client, you can try: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sqlclient/#execute-a-set-of-sql-statements If you are using TableEnvironment, you can try statement set too: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/common/

Re: Question about flink sql

2021-10-29 Thread Jake
Hi, you can do it like this:
```scala
val calciteParser = new CalciteParser(SqlUtil.getSqlParserConfig(tableEnv.getConfig))
sqlArr.foreach(item => {
  println(item)
  val itemNode = calciteParser.parse(item)
  itemNode match {
    case sqlSet: SqlSet => {
```

Re: [Question] Basic Python examples.wordcount on local FlinkRunner

2021-09-03 Thread Adam Pearce
s or resources, I’d very much appreciate it! Thanks again for the help! Best, Adam From: Dian Fu Date: Friday, September 3, 2021 at 2:10 AM To: Adam Pearce Cc: user@flink.apache.org Subject: [EXTERNAL] Re: [Question] Basic Python examples.wordcount on local FlinkRunner This seems more like a

Re: [Question] Basic Python examples.wordcount on local FlinkRunner

2021-09-02 Thread Dian Fu
This seems more like a Beam issue, although it uses the Flink runner. It would be helpful to also send it to the Beam user mailing list. Regarding this issue itself, could you check whether input.txt is accessible in the Docker container? Regards, Dian > On Sep 3, 2021, at 5:19 AM, Adam Pearce wrote: > > Hello all,

Re: Question about POJO rules - why fields should be public or have public setter/getter?

2021-07-14 Thread Timo Walther
Hi Naehee, the serializer for case classes is generated using the Scala macro that is also responsible for implicitly extracting the TypeInformation from your DataStream API program. It should be possible to use the POJO serializer with case classes. But wouldn't it be easier to just use regular

Re: Question about POJO rules - why fields should be public or have public setter/getter?

2021-07-12 Thread Naehee Kim
Hi Dawid, Thanks for your reply. Good to know it is due to historic and compatibility reasons. The reason why I started looking into POJO rules is to understand if Scala Case Class can conform to POJO rules to support schema evolution. In our case, we store several Scala Case Classes to RocksDB s

Re: Question about POJO rules - why fields should be public or have public setter/getter?

2021-07-08 Thread Dawid Wysakowicz
Hi Naehee, The short answer is historic and compatibility reasons. It was implemented that way back in the day, and we don't want to change the default type extraction logic. Otherwise user jobs that rely on the default type extraction logic for state storing would end up with a stat
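The POJO rules discussed in this thread can be illustrated with a minimal sketch (class and field names are hypothetical): the class must be public with a public no-argument constructor, and every field must be either public or reachable through a public getter/setter pair.

```java
/**
 * Sketch of a class satisfying Flink's POJO rules: public class,
 * public no-argument constructor, and every field either public
 * or exposed through a public getter/setter pair.
 */
public class SensorReading {
    public String sensorId;          // public field: fine as-is
    private double temperature;      // private field: needs getter + setter

    public SensorReading() {}        // required no-arg constructor

    public SensorReading(String sensorId, double temperature) {
        this.sensorId = sensorId;
        this.temperature = temperature;
    }

    public double getTemperature() { return temperature; }

    public void setTemperature(double temperature) { this.temperature = temperature; }

    public static void main(String[] args) {
        SensorReading r = new SensorReading("s1", 21.5);
        System.out.println(r.sensorId + " " + r.getTemperature());
    }
}
```

A class that violates these rules (for example, a private field without a setter) still works, but falls back to the generic Kryo serializer, which rules out state schema evolution.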

Re: [Question] Why doesn't Flink use Calcite adapter?

2021-06-27 Thread JING ZHANG
Hi guangyuan, This is an interesting and broad topic. I'll try to give my opinion based on my limited knowledge. Flink introduces dynamic sources to read from an external system[1]. Flink connector modules are completely decoupled from Calcite. There are two benefits: (1) If users need to dev

Re: [Question] Why doesn't Flink use Calcite adapter?

2021-06-27 Thread Israel Ekpo
This question might be better addressed to the DEV list. On Fri, Jun 25, 2021 at 11:57 PM guangyuan wang wrote: > > > > I have read the design doc of the Flink planner recently. I've found the > Flink only uses Calcite as an SQL optimizer. It

Re: question about concating an array in flink sql

2021-06-10 Thread JING ZHANG
Hi vtygoss, If the length of names is fixed, please try 'select id, concat_ws(',',names[1],names[2],names[3]) from test', and note that array indices begin at 1 instead of 0. Otherwise you may need to define a custom UDF that receives two arguments: first a string separator, second a string array as conte
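The core of the custom UDF that JING ZHANG suggests could look like this. The class and method names are hypothetical; in a real Flink job this logic would sit in the `eval()` method of a class extending `org.apache.flink.table.functions.ScalarFunction`, which is omitted here so the sketch stays self-contained.

```java
/**
 * Core of a hypothetical "join array with separator" UDF. In Flink this
 * logic would live in the eval() method of a ScalarFunction subclass;
 * it is plain Java here to stay self-contained.
 */
public class ArrayConcat {
    public static String join(String separator, String[] values) {
        if (values == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (String v : values) {
            if (v == null) {
                continue; // skip NULL elements, like concat_ws does
            }
            if (sb.length() > 0) {
                sb.append(separator);
            }
            sb.append(v);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(",", new String[]{"alice", null, "bob"})); // alice,bob
    }
}
```

Unlike the fixed-length `concat_ws(',', names[1], names[2], names[3])` workaround, this handles arrays of any length.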

Re: Question about State TTL and Interval Join

2021-06-06 Thread Yun Tang
Hi Chris, An interval join should clean up state that is not joined within the interval, so you don't need to set state TTL. (Actually, the states used in an interval join are not exposed, and you cannot set TTL for those states, as TTL is only public for user-described states.) The checkpoint size

Re: Question about State TTL and Interval Join

2021-06-04 Thread JING ZHANG
Hi Chris, There is no need to set state TTL if the only stateful operator is an interval join. Please check the watermarks of the two input streams: does a watermark fail to advance for a long time? Best regards, JING ZHANG McBride, Chris wrote on Sat, Jun 5, 2021 at 3:17 AM: > We currently have a flink 1.8 application de
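The thread concerns the DataStream interval join, but the same state-cleanup behavior can be sketched in SQL form (an assumption on my part; table and column names are hypothetical):

```sql
-- Sketch of an interval join in Flink SQL. State for rows that fall
-- outside the interval is dropped automatically once the watermark
-- passes, so no explicit TTL is needed.
SELECT o.order_id, p.payment_ts
FROM Orders o JOIN Payments p
  ON o.order_id = p.order_id
 AND p.payment_ts BETWEEN o.order_ts AND o.order_ts + INTERVAL '1' HOUR;
```

This is also why a stalled watermark on either input matters: state is only released as the watermark advances past the interval bound.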

Re: Question regarding cpu limit config in Flink standalone mode

2021-05-07 Thread Fan Xie
, 2021 8:39 PM To: user@flink.apache.org Subject: Re: Question regarding cpu limit config in Flink standalone mode Hi Fan, For a java application, you cannot specify how many cpu a process should use. The JVM process will always try to use as much cpu time as it needs. The limitation can only come

Re: Question regarding cpu limit config in Flink standalone mode

2021-05-06 Thread Xintong Song
Hi Fan, For a Java application, you cannot specify how much CPU a process should use. The JVM process will always try to use as much CPU time as it needs. The limitation can only come from outside: hardware limits, OS scheduling, cgroups, etc. On Kubernetes, it is the pod's resource specification
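The pod-level limit Xintong refers to is expressed in the container's resource spec. A minimal sketch (names, image, and values are hypothetical, not taken from the thread):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: flink-taskmanager
spec:
  containers:
    - name: taskmanager
      image: flink:1.12
      resources:
        requests:
          cpu: "1"    # the scheduler reserves one core for the pod
        limits:
          cpu: "2"    # cgroups throttle the JVM beyond two cores
```

The JVM itself stays unaware of these bounds; the kernel's cgroup controller enforces them from outside, which is exactly the external limitation described above.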
