Hi!
Have any other devs noticed issues with Flink missing Kafka records with
long-running Flink jobs? When I re-run my Flink job and start from the
earliest Kafka offset, Flink processes the events correctly. I'm using
Flink v1.11.1.
I have a simple job that takes records (Requests) from Kafka
Hi all,
I am building an application that launches Flink Jobs and monitors them.
I want to use the JobListener interface to output job events to a Kafka topic.
The problem:
In the application we have a RuleId, i.e. the business-logic identifier for the
job, and there’s a JobId, which is the internal
Hi John,
Logically the maximum retained checkpoints are configured
by state.checkpoints.num-retained [1]. Have you configured
this option?
Best,
Yun
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/deployment/config.html#state-checkpoints-num-retained
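For reference, setting this is a one-line change in flink-conf.yaml (the value 3 below is only an example; the default keeps just the latest completed checkpoint):

```yaml
# Retain the 3 most recent completed checkpoints instead of only the latest one
state.checkpoints.num-retained: 3
```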
Hi, I ran into a problem like this:
this is my sql:
SELECT distinct header,label,reqsig,dense_feat,item_id_feat,user_id_feat FROM
app.app_ranking_feature_table_clk_ord_hp_new_all_tree_orc where dt='2021-04-01'
When I use the BlinkPlanner in BatchMode, it works well; but if I set it in
StreamMode, it cau
environment:
flinksql 1.12.2
k8s session mode
description:
I got the following error log when my Kafka connector port was wrong
>
2021-04-25 16:49:50
org.apache.kafka.common.errors.TimeoutException: Timeout of 6ms expired
before the position for partition filebeat_json_install_log-3 could
>
> Now, I'm just worried about the state size. State size will grow forever.
> There is no TTL.
The potential for unbounded state is certainly a problem, and it's going to
be a problem no matter how you implement the deduplication. Standard
techniques for mitigating this include (1) limiting the
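As an illustration of the time-bounded flavor of that idea, here is a sketch (plain Python, not Flink state API code; the class name and retention value are made up) of deduplication whose state cannot grow forever, because keys are evicted once they fall outside a retention window:

```python
import time


class TimeBoundedDeduplicator:
    """Remembers seen keys only for `retention_secs`, so state stays bounded."""

    def __init__(self, retention_secs, clock=time.time):
        self.retention_secs = retention_secs
        self.clock = clock
        self._seen = {}  # key -> last-seen timestamp

    def is_duplicate(self, key):
        now = self.clock()
        # Evict expired entries so the dict cannot grow without bound.
        expired = [k for k, ts in self._seen.items()
                   if now - ts > self.retention_secs]
        for k in expired:
            del self._seen[k]
        duplicate = key in self._seen
        self._seen[key] = now
        return duplicate
```

In Flink itself the equivalent would be keyed state with a StateTtlConfig, which expires entries without this manual sweep.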
Hi Dian,
I tried your suggestion but had the same error message unfortunately. I
also tried file:/ and file:// with the same error, not sure what's going
on, I assume writing to avro works fine in java and scala?
Eddie
On Sat, Apr 24, 2021 at 10:03 PM Dian Fu wrote:
> I guess you only need fil
Hi,
I'm curious why Event Time Temporal Join needs watermarks from both sides to
perform join.
Shouldn't watermark on versioned table side be enough to perform join ?
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hey Yang, Community
As discussed a few weeks ago, I'm working on the Application Cluster - Native
K8s approach, running Flink 1.12.2.
We deploy application clusters programmatically, which works well.
In addition, we leverage the Kubernetes client (Fabric8io) to watch the
deployment/pod status and get
1. What if there is a very high number of users, like a million customers?
Won't the service crash? Is it advisable to hold the data in memory?
2. What if stateful functions are used to calculate the value? How will this
approach differ from the one proposed below?
Regards,
Swagat
On Wed, Apr 21,
Can a Kafka Consumer Source have more tasks run in parallel than the number
of partitions for the topic it is the source of? Or is the max parallelism
of the source constrained by max partitions of the topic?
Hi Eddie,
I have tried your program with the following changes and it could execute
successfully:
- Replace
`rf"file:///{os.getcwd()}/lib/flink-sql-avro-1.12.2.jar;file:///{os.getcwd()}/lib/flink-avro-1.12.2.jar;file:///{os.getcwd()}/lib/avro-1.10.2.jar"`
with
`rf"file:///Users/dianfu/code/src
No, but in the end I decided to disable it.
On Sun., Apr. 25, 2021, 5:14 a.m. Yun Gao, wrote:
> Hi John,
>
> Logically the maximum retained checkpoints are configured
> by state.checkpoints.num-retained [1]. Have you configured
> this option?
>
>
> Best,
> Yun
>
> [1]
> https://ci.apache.org/project
It seems that the JobListener interface cannot expose such
information. Maybe you can set the RuleId as the jobName (or a suffix
of the jobName) of the application; then you can get the mapping of
jobId to jobName (RuleId) through /jobs/overview.
[1]
https://ci.apache.org/projects/flink/flink-d
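A sketch of that suggestion (the `-rule-` naming scheme and field handling here are made up for illustration; the `jid` and `name` fields do come from the REST /jobs/overview response):

```python
import json

# Hypothetical naming scheme: submit jobs named "<app>-rule-<RuleId>", then
# recover the RuleId mapping from the REST /jobs/overview payload.
def rule_id_mappings(overview_json: str) -> dict:
    """Return {jobId: ruleId} for jobs whose name carries a rule suffix."""
    mappings = {}
    for job in json.loads(overview_json).get("jobs", []):
        name = job.get("name", "")
        if "-rule-" in name:
            mappings[job["jid"]] = name.rsplit("-rule-", 1)[1]
    return mappings
```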
Hi, maverick.
The watermark is used to determine whether a message is late or early. If we
only use the watermark on the versioned table side, we have no means to
determine whether the events in the main stream are ready to emit.
Best,
Shengkai
maverick wrote on Mon, Apr 26, 2021 at 2:31 AM:
> Hi,
> I'm curious why Even
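One way to picture the explanation above (a simplified model, not the operator's actual code): a two-input event-time operator can only safely emit up to the minimum of its input watermarks, so both sides must advance.

```python
def emittable_events(main_events, main_wm, table_wm):
    """main_events: list of (timestamp, payload) pairs from the main stream.

    An event is safe to join/emit only once BOTH watermarks have passed its
    timestamp; with only the versioned-table watermark we could not tell
    whether earlier main-stream events might still arrive.
    """
    frontier = min(main_wm, table_wm)
    return [e for e in main_events if e[0] <= frontier]
```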
My Flink job failed to checkpoint with a "The job has failed" error. The
logs contained no other recent errors. I keep hitting the error even if I
cancel the jobs and restart them. When I restarted my jobmanager and
taskmanager, the error went away.
What error am I hitting? It looks like there
1. It always depends on the data volume per user. A million users is not
much if you compare it to the biggest Flink installations (Netflix,
Alibaba, Pinterest, ...). However, for a larger scale and better scalability,
I'd recommend using the RocksDB state backend. [1]
2. Are you referring to Statefun? I'd s
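For point 1, enabling RocksDB is a flink-conf.yaml change along these lines (the checkpoint directory below is a placeholder):

```yaml
state.backend: rocksdb
# Placeholder location; any durable filesystem/object store works
state.checkpoints.dir: s3://my-bucket/checkpoints
```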
Hi Shengkai,
Thanks for the answer. The question is: do we need to determine if an
event in the main stream is late?
Let's look at interval join - event is emitted as soon as there is a
match between left and right stream.
I agree the watermark should pass on the versioned table side, because
this is th
Hi,
Yes, you are correct that if an event cannot match any pattern, it won't
be stored in state. If you process your records in event time, it might
be stored for a little while before processing, in order to sort the
incoming records based on time. Once a Watermark with a higher timestamp
comes, it w
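The buffering-and-sorting behavior described above can be sketched like this (a simplified model of event-time sorting, not Flink's CEP internals):

```python
import heapq


class EventTimeSorter:
    """Buffers (timestamp, event) pairs and releases them in timestamp order
    once a watermark guarantees no earlier event can still arrive."""

    def __init__(self):
        self._buffer = []  # min-heap ordered by timestamp

    def add(self, timestamp, event):
        heapq.heappush(self._buffer, (timestamp, event))

    def on_watermark(self, watermark):
        # Everything at or below the watermark is now complete: emit it sorted.
        released = []
        while self._buffer and self._buffer[0][0] <= watermark:
            released.append(heapq.heappop(self._buffer))
        return released
```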
Hi,
The MATCH_RECOGNIZE clause in the SQL standard does not support different
contiguities; MATCH_RECOGNIZE always uses strict contiguity.
Best,
Dawid
On 21/04/2021 00:02, tbud wrote:
> There's 3 different types of Contiguity defined in the CEP documentation [1]
> looping + non-looping -- Stri
Hi, could you tell me which version you use? I just want to check
whether there are any problems.
Best,
Shengkai
张颖 wrote on Sun, Apr 25, 2021 at 5:23 PM:
> Hi, I ran into a problem like this:
>
> this is my sql:
> SELECT distinct header,label,reqsig,dense_feat,item_id_feat,user_id_feat
> FROM app.app_rankin
Hi, Tony.
What is the version of your flink-dist? AFAIK, this issue should be
addressed by FLINK-15852 [1]. Could you share the client log of case
2? (Setting the log level to DEBUG would be better.)
[1] https://issues.apache.org/jira/browse/FLINK-15852
Best,
Yangze Guo
On Sun, Apr 25, 2021 at 11:33 AM
Hi, Gil
IIUC, you want to deploy the Flink cluster using YAML files yourselves and
want to know whether the JM should be deployed as a Job [1] or a
Deployment. If that is the case, as Matthias mentioned, Flink provides
two ways to integrate with K8s [2][3]; in [3] the JM will be deployed
as a Deployment.
Hi Dan,
can you provide me with the JobManager logs to take a look as well? (This
will also tell me which Flink version you are using)
On Mon, Apr 26, 2021 at 7:20 AM Dan Hill wrote:
> My Flink job failed to checkpoint with a "The job has failed" error. The
> logs contained no other recent e
Hey Prashant,
the Kafka Consumer parallelism is constrained by the number of partitions
the topic(s) have. If you have configured the Kafka Consumer in Flink with
a parallelism of 100, but your topic has only 20 partitions, 80 consumer
instances in Flink will be idle.
On Mon, Apr 26, 2021 at 2:54
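The arithmetic in the answer above can be sketched as follows (a simplified model of partition assignment, not Flink's actual assigner):

```python
def consumer_utilization(parallelism, num_partitions):
    """Each Kafka partition is read by exactly one consumer instance, so at
    most `num_partitions` instances do work; any surplus instances sit idle.
    Returns (active, idle) instance counts."""
    active = min(parallelism, num_partitions)
    idle = max(parallelism - num_partitions, 0)
    return active, idle
```

With the numbers from the answer, a parallelism of 100 against a 20-partition topic leaves 80 instances idle.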