Flink missing Kafka records

2021-04-25 Thread Dan Hill
Hi! Have any other devs noticed issues with Flink missing Kafka records with long-running Flink jobs? When I re-run my Flink job and start from the earliest Kafka offset, Flink processes the events correctly. I'm using Flink v1.11.1. I have a simple job that takes records (Requests) from Kafka

Receiving context information through JobListener interface

2021-04-25 Thread Barak Ben Nathan
Hi all, I am building an application that launches Flink Jobs and monitors them. I want to use the JobListener interface to output job evemts to a Kafka Topic. The problem: In the application we have RuleId, i.e. business logic identifier for the job, and there’s JobId which is the internal

Re: Too man y checkpoint folders kept for externalized retention.

2021-04-25 Thread Yun Gao
Hi John, Logically the maximum retained checkpoints are configured by state.checkpoints.num-retained [1]. Have you configured this option? Best, Yun [1] https://ci.apache.org/projects/flink/flink-docs-stable/deployment/config.html#state-checkpoints-num-retained

Read Hive table in Stream Mode use distinct cause heap OOM

2021-04-25 Thread 张颖
hi,I met an appearance like this: this is my sql: SELECT distinct header,label,reqsig,dense_feat,item_id_feat,user_id_feat FROM app.app_ranking_feature_table_clk_ord_hp_new_all_tree_orc where dt='2021-04-01' When I useBlinkPlanner inBatchMode, It works well; But if I set inStreamMode, It cau

The wrong Options of Kafka Connector, will make the cluster can not run any job

2021-04-25 Thread chenxuying
environment: flinksql 1.12.2 k8s session mode description: I got follow error log when my kafka connector port was wrong > 2021-04-25 16:49:50 org.apache.kafka.common.errors.TimeoutException: Timeout of 6ms expired before the position for partition filebeat_json_install_log-3 could

Re: Approaches for external state for Flink

2021-04-25 Thread David Anderson
> > Now, I'm just worried about the state size. State size will grow forever. > There is no TTL. The potential for unbounded state is certainly a problem, and it's going to be a problem no matter how you implement the deduplication. Standard techniques for mitigating this include (1) limiting the

Re: Writing to Avro from pyflink

2021-04-25 Thread Edward Yang
Hi Dian, I tried your suggestion but had the same error message unfortunately. I also tried file:/ and file:// with the same error, not sure what's going on, I assume writing to avro works fine in java and scala? Eddie On Sat, Apr 24, 2021 at 10:03 PM Dian Fu wrote: > I guess you only need fil

Watermarks in Event Time Temporal Join

2021-04-25 Thread maverick
Hi, I'm curious why Event Time Temporal Join needs watermarks from both sides to perform join. Shouldn't watermark on versioned table side be enough to perform join ? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Application cluster - Job execution and cluster creation timeouts

2021-04-25 Thread Tamir Sagi
Hey Yang, Community As been discussed few weeks ago, I'm working on Application Cluster - Native K8s approach, running Flink 1.12.2. We deploy application clusters programmatically which works well. In addition, we leverage Kubernetes client(Fabric8io) to watch the deployment/pods status and get

Re: Flink Event specific window

2021-04-25 Thread Swagat Mishra
1. What if there are a very high number of users, like a million customers won't the service crash? Is it advisable to hold the data in memory. 2. What if state-functions are used to calculate the value ? How will this approach differ from the one proposed below. Regards, Swagat On Wed, Apr 21,

kafka consumers partition count and parallelism

2021-04-25 Thread Prashant Deva
Can a Kafka Consumer Source have more tasks run in parallel than the number of partitions for the topic it is the source of? Or is the max parallelism of the source constrained by max partitions of the topic?

Re: Writing to Avro from pyflink

2021-04-25 Thread Dian Fu
Hi Eddie, I have tried your program with the following changes and it could execute successfully: - Replace `rf"file:///{os.getcwd()}/lib/flink-sql-avro-1.12.2.jar;file:///{os.getcwd()}/lib/flink-avro-1.12.2.jar;file:///{os.getcwd()}/lib/avro-1.10.2.jar”` with rf`"file:///Users/dianfu/code/src

Re: Too man y checkpoint folders kept for externalized retention.

2021-04-25 Thread John Smith
No. But I decided to disable it finally On Sun., Apr. 25, 2021, 5:14 a.m. Yun Gao, wrote: > Hi John, > > Logically the maximum retained checkpoints are configured > by state.checkpoints.num-retained [1]. Have you configured > this option? > > > Best, > Yun > > [1] > https://ci.apache.org/project

Re: Receiving context information through JobListener interface

2021-04-25 Thread Yangze Guo
It seems that the JobListener interface could not expose such information. Maybe you can set the RuleId as the jobName(or the suffix of the jobName) of the application, then you can get the mappings of jobId to jobName(RuleId) throw /jobs/overview. [1] https://ci.apache.org/projects/flink/flink-d

Re: Watermarks in Event Time Temporal Join

2021-04-25 Thread Shengkai Fang
Hi, maverick. The watermark is used to determine the message is late or early. If we only use the watermark on versioned table side, we have no means to determine whether the event in the main stream is ready to emit. Best, Shengkai maverick 于2021年4月26日周一 上午2:31写道: > Hi, > I'm curious why Even

Checkpoint error - "The job has failed"

2021-04-25 Thread Dan Hill
My Flink job failed to checkpoint with a "The job has failed" error. The logs contained no other recent errors. I keep hitting the error even if I cancel the jobs and restart them. When I restarted my jobmanager and taskmanager, the error went away. What error am I hitting? It looks like there

Re: Flink Event specific window

2021-04-25 Thread Arvid Heise
1. It always depends on the data volume per user. A million user is not much if you compare it to the biggest Flink installations (Netflix, Alibaba, PInterest, ...). However, for a larger scale and scalability, I'd recommend to use rocksDB state backend. [1] 2. Are you referring to statefun? I'd s

Re: Watermarks in Event Time Temporal Join

2021-04-25 Thread Maciej Bryński
Hi Shengkai, Thanks for the answer. The question is do we need to determine if an event in the main stream is late. Let's look at interval join - event is emitted as soon as there is a match between left and right stream. I agree the watermark should pass on versioned table side, because this is th

Re: Contiguity and state storage in CEP library

2021-04-25 Thread Dawid Wysakowicz
Hi, Yes you are correct that if an event can not match any pattern it won't be stored in state. If you process your records in event time it might be stored for a little while before processing in order to sort the incoming records based on time. Once a Watermark with a higher timestamp comes it w

Re: Contiguity in SQL vs CEP

2021-04-25 Thread Dawid Wysakowicz
Hi, MATCH_RECOGNIZE clause in SQL standard does not support different contiguities. The MATCH_RECOGNIZE always uses the strict contiguity. Best, Dawid On 21/04/2021 00:02, tbud wrote: > There's 3 different types of Contiguity defined in the CEP documentation [1] > looping + non-looping -- Stri

Re: Read Hive table in Stream Mode use distinct cause heap OOM

2021-04-25 Thread Shengkai Fang
Hi, could you tell me which version do you use? I just want to check whether there are any problems. Best, Shengkai 张颖 于2021年4月25日周日 下午5:23写道: > hi,I met an appearance like this: > > this is my sql: > SELECT distinct header,label,reqsig,dense_feat,item_id_feat,user_id_feat > FROM app.app_rankin

Re: when should `FlinkYarnSessionCli` be included for parsing CLI arguments?

2021-04-25 Thread Yangze Guo
Hi, Tony. What is the version of your flink-dist. AFAIK, this issue should be addressed in FLINK-15852[1]. Could you give the client log of case 2(set the log level to DEBUG would be better). [1] https://issues.apache.org/jira/browse/FLINK-15852 Best, Yangze Guo On Sun, Apr 25, 2021 at 11:33 AM

Re: Kubernetes Setup - JM as job vs JM as deployment

2021-04-25 Thread Yangze Guo
Hi, Gil IIUC, you want to deploy Flink cluster using YAML files yourselves and want to know whether the JM should be deployed as Job[1] or Deployment. If that is the case, as Matthias mentioned, Flink provides two ways to integrate with K8S [2][3], in [3] the JM will be deployed as a Deployment.

Re: Checkpoint error - "The job has failed"

2021-04-25 Thread Robert Metzger
Hi Dan, can you provide me with the JobManager logs to take a look as well? (This will also tell me which Flink version you are using) On Mon, Apr 26, 2021 at 7:20 AM Dan Hill wrote: > My Flink job failed to checkpoint with a "The job has failed" error. The > logs contained no other recent e

Re: kafka consumers partition count and parallelism

2021-04-25 Thread Robert Metzger
Hey Prashant, the Kafka Consumer parallelism is constrained by the number of partitions the topic(s) have. If you have configured the Kafka Consumer in Flink with a parallelism of 100, but your topic has only 20 partitions, 80 consumer instances in Flink will be idle. On Mon, Apr 26, 2021 at 2:54