Hi,
I am currently deploying a Flink pipeline using Flink Kubernetes Operator
v1.9 alongside Flink version 1.19.1 as part of a comprehensive use case.
When the use case is undeployed, I need to ensure that the Flink pipeline
is properly canceled and the Flink cluster is taken down.
My approach
Hi,
We are exploring options to support canary upgrades for our Flink pipelines
and have considered the following approach.
1. *Stateless Pipelines*: For stateless pipelines, we define two clusters,
job-v1 and job-v2, using the same consumer group to avoid event
duplication. To contr
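A minimal sketch of how the shared group could be set on the source, assuming
the KafkaSource builder API and hypothetical broker/topic/group names. One
caveat worth noting: Flink assigns Kafka partitions itself, so group.id mainly
governs offset committing rather than partition balancing between the two jobs.

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CanarySourceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Both job-v1 and job-v2 would build their source with the same group id.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")       // hypothetical broker
                .setTopics("events")                     // hypothetical topic
                .setGroupId("canary-shared-group")       // hypothetical shared group
                .setStartingOffsets(OffsetsInitializer.committedOffsets())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source").print();
        env.execute("canary-source-sketch");
    }
}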
Hello Kartik,
For your case: if events are ingested at 300 per minute, that is 300/60 = 5
events/second; with a 2 KB payload, the ingestion rate is 5 * 2 KB = 10 KB per
second. The network buffer segment size is 32 KB by default; you can also
decrease it to 16 KB.
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/config/#tas
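As a concrete sketch, the segment size can also be set programmatically for a
local or application-mode job (on a standing cluster it belongs in
flink-conf.yaml); the key assumed here is taskmanager.memory.segment-size:

import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.MemorySize;
import org.apache.flink.configuration.TaskManagerOptions;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SmallSegmentSize {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Default is 32kb; with only ~10 KB/s arriving, a 16kb segment fills
        // (and therefore flushes) in roughly half the time.
        conf.set(TaskManagerOptions.MEMORY_SEGMENT_SIZE, MemorySize.parse("16kb"));
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        env.fromElements("event").print();
        env.execute("segment-size-sketch");
    }
}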
Thank you. I will check and get back on both the suggestions made by
Asimansu and Xuyang.
I am using Flink 1.17.0
Regards,
Kartik
On Mon, Apr 1, 2024, 5:13 AM Asimansu Bera wrote:
> Hello Karthik,
>
> You may check the execution-buffer-timeout-interval parameter. This value
> is an important one for your case.
Hello Karthik,
You may check the execution-buffer-timeout-interval parameter. This value
is an important one for your case. I experienced a similar issue in the
past.
https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/config/#execution-buffer-timeout-interval
For your
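For reference, the same knob is exposed per job in the DataStream API as
setBufferTimeout. A minimal sketch, assuming a latency-sensitive job where a
5 ms flush interval is acceptable (the default is 100 ms; 0 flushes after
every record):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BufferTimeoutExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Lower the flush interval from the default 100 ms: better latency,
        // slightly lower throughput because buffers ship less full.
        env.setBufferTimeout(5);
        env.fromElements("a", "b", "c").map(String::toUpperCase).print();
        env.execute("buffer-timeout-sketch");
    }
}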
Hello,
I have a Streaming event processing job that looks like this.
*Source - ProcessFn(3 in total) - Sink*
I am seeing a delay of 50 ms to 250 ms between each pair of operators (maybe
buffering or serde delays), leading to slow end-to-end processing. What
could be the reason for such high latency?
So
Hi Sigalit,
In your settings, I guess each job will only have one slot (parallelism 1). So
is the input perhaps too much for jobs running with a parallelism of only one?
One easy way to confirm is to double your slots and job parallelism and then
see whether the QPS increases.
Hope this would help.
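A minimal sketch of that experiment, assuming two free task slots exist (the
same effect is available at submission time with flink run -p 2):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ParallelismCheck {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Double the parallelism from 1 to 2; each pipeline then needs two
        // slots, so make sure the cluster has them free.
        env.setParallelism(2);
        env.fromElements(1, 2, 3, 4).map(i -> i * 10).print();
        env.execute("parallelism-check-sketch");
    }
}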
Sent: Thursday, April 7, 2022 19:12
To: user
Subject: flink pipeline handles very small amount of messages in a second (only
500)
hi all
I would appreciate some help to understand the pipeline behaviour...
We deployed a standalone Flink cluster. The pipelines are deployed via the JM
REST API.
We
hi all
I would appreciate some help to understand the pipeline behaviour...
We deployed a standalone Flink cluster. The pipelines are deployed via
the JM REST API.
We have 5 task managers with 1 slot each.
In total I am deploying 5 pipelines which mainly read from Kafka, a simple
object conversi
@d...@spark.apache.org @user
I am looking for an example of an observability framework for Apache
Flink pipelines.
This could be message tracing across multiple Flink pipelines, or querying
the past state of a message that was processed by any Flink pipeline.
If anyone has done similar work
Are the number of sinks fixed? If so, then you can just take the output
of your map function and apply multiple filters, writing the output of
each filter into a sink. You could also use a process function with
side-outputs, and attach a sink to each output.
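A minimal sketch of the side-output variant, with hypothetical tag names and
print() standing in for the real sinks:

import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SideOutputRouting {
    // Anonymous subclasses so Flink can capture the element type.
    static final OutputTag<String> SINK_1 = new OutputTag<String>("sink-1") {};
    static final OutputTag<String> SINK_2 = new OutputTag<String>("sink-2") {};

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        SingleOutputStreamOperator<String> routed = env
                .fromElements("a:1", "b:2")
                .process(new ProcessFunction<String, String>() {
                    @Override
                    public void processElement(String value, Context ctx, Collector<String> out) {
                        // Route on the data itself; this predicate is hypothetical.
                        if (value.startsWith("a")) {
                            ctx.output(SINK_1, value);
                        } else {
                            ctx.output(SINK_2, value);
                        }
                    }
                });
        // Each side output is an ordinary DataStream, so each can get its own
        // sink, and setParallelism() can be called on each sink separately.
        routed.getSideOutput(SINK_1).print("es-sink-1");
        routed.getSideOutput(SINK_2).print("es-sink-2");
        env.execute("side-output-routing-sketch");
    }
}

Since each side output is a plain DataStream, setting a different parallelism
per ES sink (as asked below) is just a setParallelism call on each sink.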
On 10/14/2020 6:05 PM, Vignesh Ram
My requirement is to send the data to a different ES sink (based on the
data). Ex: if the data contains particular info, send it to sink1; otherwise
send it to sink2, etc. (basically, send it dynamically to any one sink based
on the data). I also want to set parallelism separately for ES sink1, ES
sink2, Es
Hi Aissa
Looks like your requirement is to enrich real stream data (from Kafka) with
dimension data (in your case something like: {sensor_id, equipment_id,
workshop_id, factory_id}). You can achieve your purpose with the Flink
DataStream API or just use Flink SQL. I think pure SQL will be easier if you
You will have to enrich the data coming in, e.g. { "equipment-id" :
"1-234", "sensor-id" : "1-vcy", . }. Since you will most likely have
a KeyedStream based on equipment-id+sensor-id or equipment-id, you can have
a control stream with data about the equipment-to-workshop/factory mapping,
something like the sketch below.
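A minimal sketch of that control-stream idea using broadcast state, with
hypothetical inline sources standing in for the Kafka topics:

import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class BroadcastEnrichment {
    // equipment-id -> factory-id mapping, broadcast to every parallel instance
    static final MapStateDescriptor<String, String> MAPPING =
            new MapStateDescriptor<>("equipment-to-factory", Types.STRING, Types.STRING);

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> readings = env.fromElements("1-234,42.0");      // "equipmentId,value"
        DataStream<String> mappings = env.fromElements("1-234=factory-A"); // "equipmentId=factoryId"
        BroadcastStream<String> control = mappings.broadcast(MAPPING);

        readings.connect(control)
                .process(new BroadcastProcessFunction<String, String, String>() {
                    @Override
                    public void processElement(String reading, ReadOnlyContext ctx,
                                               Collector<String> out) throws Exception {
                        String equipmentId = reading.split(",")[0];
                        String factory = ctx.getBroadcastState(MAPPING).get(equipmentId);
                        out.collect(reading + ",factory=" + (factory == null ? "unknown" : factory));
                    }

                    @Override
                    public void processBroadcastElement(String mapping, Context ctx,
                                                        Collector<String> out) throws Exception {
                        String[] kv = mapping.split("=");
                        ctx.getBroadcastState(MAPPING).put(kv[0], kv[1]);
                    }
                })
                .print();
        env.execute("broadcast-enrichment-sketch");
    }
}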
Hello Guys,
I am new to the real-time streaming field, and I am trying to build a BIG
DATA architecture for processing real-time streams. I have some sensors
that generate data in JSON format; they are sent to an Apache Kafka cluster,
then I want to consume them with Apache Flink in order to do some
a
Hi Igal,
Thanks for your responses.
Regarding "having a pre-processing job that does whatever transformations
necessary with the DataStream and outputs to Kafka / Kinesis, and then
having a separate StateFun deployment that consumes from that transformed
Kafka / Kinesis topic."
I was wondering how
Hi Max,
Sorry for not being clearer earlier; by splitting the pipeline I mean:
having a pre-processing job that does whatever transformations necessary
with the DataStream and outputs to Kafka / Kinesis,
and then having a separate StateFun deployment that consumes from that
transformed Kafka / Kin
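A minimal sketch of the pre-processing half, assuming the KafkaSink API
(Flink 1.14+) and hypothetical broker/topic names; the separate StateFun
deployment would then bind the same topic as an ingress in its module
definition:

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class PreprocessingJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Stand-in for whatever DataStream transformations are necessary.
        DataStream<String> transformed = env
                .fromElements("raw-1", "raw-2")
                .map(s -> s.toUpperCase());
        // Hand the transformed events to Kafka, where StateFun picks them up.
        KafkaSink<String> sink = KafkaSink.<String>builder()
                .setBootstrapServers("kafka:9092")
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("transformed-events")
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                .build();
        transformed.sinkTo(sink);
        env.execute("preprocessing-job-sketch");
    }
}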
Dear Igal,
Can you elaborate more on your proposed solution of splitting the pipeline?
If possible, providing some skeleton pseudocode would be awesome!
Thanks in advance.
Best,
Max
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
> Best regards
> Theo
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/asyncio.html
>
> --
> *From: *"Elkhan Dadashov"
> *To: *"user"
> *Sent: *Thursday, 16 April 2020 10:37:55
>
ctions job.
>
> Good luck,
> Igal.
>
> On Wed, Apr 22, 2020 at 4:26 PM Annemarie Burger <
> annemarie.bur...@campus.tu-berlin.de> wrote:
>
>> I was wondering if it is possible to use a Stateful Function within a
>> Flink
>> pipeline. I know they work with
:
> I was wondering if it is possible to use a Stateful Function within a Flink
> pipeline. I know they work with different API's, so I was wondering if it
> is
> possible to have a DataStream as ingress for a Stateful Function.
>
> Some context: I'm working on a streamin
t 10:26 AM Annemarie Burger <
annemarie.bur...@campus.tu-berlin.de> wrote:
> I was wondering if it is possible to use a Stateful Function within a Flink
> pipeline. I know they work with different API's, so I was wondering if it
> is
> possible to have a DataStream as ingress
I was wondering if it is possible to use a Stateful Function within a Flink
pipeline. I know they work with different APIs, so I was wondering if it is
possible to have a DataStream as ingress for a Stateful Function.
Some context: I'm working on a streaming graph analytics system, a
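For what it's worth, a minimal sketch of embedding functions with a DataStream
as ingress, assuming the statefun-flink-datastream module (available since
StateFun 2.1) and hypothetical function/egress names:

import org.apache.flink.statefun.flink.datastream.RoutableMessage;
import org.apache.flink.statefun.flink.datastream.RoutableMessageBuilder;
import org.apache.flink.statefun.flink.datastream.StatefulFunctionDataStreamBuilder;
import org.apache.flink.statefun.flink.datastream.StatefulFunctionEgressStreams;
import org.apache.flink.statefun.sdk.Context;
import org.apache.flink.statefun.sdk.FunctionType;
import org.apache.flink.statefun.sdk.StatefulFunction;
import org.apache.flink.statefun.sdk.io.EgressIdentifier;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class DataStreamIngressSketch {
    static final FunctionType GREET = new FunctionType("example", "greet");
    static final EgressIdentifier<String> GREETINGS =
            new EgressIdentifier<>("example", "greetings", String.class);

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Wrap an ordinary DataStream as routable StateFun ingress messages.
        DataStream<RoutableMessage> ingress = env
                .fromElements("alice", "bob")
                .map(name -> RoutableMessageBuilder.builder()
                        .withTargetAddress(GREET, name)
                        .withMessageBody(name)
                        .build());

        StatefulFunctionEgressStreams out = StatefulFunctionDataStreamBuilder
                .builder("datastream-ingress-example")
                .withDataStreamAsIngress(ingress)
                .withFunctionProvider(GREET, unused -> new Greeter())
                .withEgressId(GREETINGS)
                .build(env);

        // The egress comes back out as a plain DataStream.
        out.getDataStreamForEgressId(GREETINGS).print();
        env.execute("statefun-datastream-sketch");
    }

    static class Greeter implements StatefulFunction {
        @Override
        public void invoke(Context context, Object input) {
            context.send(GREETINGS, "Hello " + input);
        }
    }
}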
16 April 2020 10:37:55
Subject: How to scale a streaming Flink pipeline without abusing parallelism
for long computation tasks?
Hi Flink users,
I have a basic Flink pipeline doing a flatmap.
Inside the flatmap, I get the input and pass it to the client library to compute
some result.
That library ex
Hi Flink users,
I have a basic Flink pipeline doing a flatmap.
Inside the flatmap, I get the input and pass it to the client library to
compute some result.
That library execution takes around 30 seconds to 2 minutes (depending on
the input) to produce the output from the given input (it is
time-ser
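A minimal sketch of offloading such a slow call with Flink's Async I/O
operator instead of raising the flatmap parallelism; the library call and
thread-pool sizing are hypothetical:

import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class AsyncLibraryCall {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> input = env.fromElements("a", "b", "c");
        // Up to 50 calls in flight per subtask; the 150 s timeout covers the
        // 30 s - 2 min library latency with some headroom.
        DataStream<String> results = AsyncDataStream.unorderedWait(
                input, new ExpensiveCall(), 150, TimeUnit.SECONDS, 50);
        results.print();
        env.execute("async-io-sketch");
    }

    static class ExpensiveCall extends RichAsyncFunction<String, String> {
        private transient ExecutorService pool;

        @Override
        public void open(Configuration parameters) {
            pool = Executors.newFixedThreadPool(50); // hypothetical sizing
        }

        @Override
        public void asyncInvoke(String value, ResultFuture<String> resultFuture) {
            // Run the blocking library call off the operator's main thread.
            CompletableFuture
                    .supplyAsync(() -> slowLibraryCall(value), pool)
                    .thenAccept(result -> resultFuture.complete(Collections.singleton(result)));
        }

        // Stands in for the 30 s - 2 min client-library computation.
        private String slowLibraryCall(String in) {
            return in + "-result";
        }

        @Override
        public void close() {
            pool.shutdown();
        }
    }
}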
I guess using a session cluster rather than a job cluster will decouple the
job from the container and may be the only option as of today?
On Sat, Jun 29, 2019, 9:34 AM Vishal Santoshi
wrote:
> So there are 2 scenarios
>
> 1. If the JM goes down (exits) and k8s relaunches the Job Cluster (the J
So there are 2 scenarios
1. If the JM goes down (exits) and k8s relaunches the Job Cluster (the JM),
it uses the exact command the JM was brought up with in the first place.
2. If the pipe is restarted for any other reason by the JM (the JM has not
exited but *Job Kafka-to-HDFS (
Another point: the JM had terminated. The policy on K8s for a Job Cluster is
spec:
restartPolicy: OnFailure
*2019-06-29 00:33:14,308 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Terminating cluster
entrypoint process StandaloneJobClusterEntryPoint with exit code 1443.*
On Sat,
I have not tried on bare metal. We have no option but k8s.
And this is a job cluster.
On Sat, Jun 29, 2019 at 9:01 AM Timothy Victor wrote:
> Hi Vishal, can this be reproduced on a bare metal instance as well? Also
> is this a job or a session cluster?
>
> Thanks
>
> Tim
>
> On Sat, Jun 29, 2
Hi Vishal, can this be reproduced on a bare metal instance as well? Also
is this a job or a session cluster?
Thanks
Tim
On Sat, Jun 29, 2019, 7:50 AM Vishal Santoshi
wrote:
> OK this happened again and it is bizarre (and is definitely not what I
> think should happen)
>
>
>
>
> The job fai
OK this happened again and it is bizarre (and is definitely not what I
think should happen).
The job failed and I see these logs (in essence it is keeping the last 5
externalized checkpoints) but deleting the ZK checkpoints directory
*06.28.2019 20:33:13.7382019-06-29 00:33:13,73
Ok, I will do that.
On Wed, Jun 5, 2019, 8:25 AM Chesnay Schepler wrote:
> Can you provide us the jobmanager logs?
>
> After the first restart the JM should have started deleting older
> checkpoints as new ones were created.
> After the second restart the JM should have recovered all 10 checkpoi
Can you provide us the jobmanager logs?
After the first restart the JM should have started deleting older
checkpoints as new ones were created.
After the second restart the JM should have recovered all 10
checkpoints, start from the latest, and start pruning old ones as new
ones were created.
Anyone?
On Tue, Jun 4, 2019, 2:41 PM Vishal Santoshi
wrote:
> The above is flink 1.8
>
> On Tue, Jun 4, 2019 at 12:32 PM Vishal Santoshi
> wrote:
>
>> I had a sequence of events that created this issue.
>>
>> * I started a job and I had the state.checkpoints.num-retained: 5
>>
>> * As expected
The above is Flink 1.8
On Tue, Jun 4, 2019 at 12:32 PM Vishal Santoshi
wrote:
> I had a sequence of events that created this issue.
>
> * I started a job and I had the state.checkpoints.num-retained: 5
>
> * As expected I have 5 latest checkpoints retained in my hdfs backend.
>
>
> * JM dies ( K
I had a sequence of events that created this issue.
* I started a job and I had the state.checkpoints.num-retained: 5
* As expected I have 5 latest checkpoints retained in my hdfs backend.
* The JM dies (K8s limit etc.) without cleaning the HDFS directory. The K8s
job restores from the latest che
There is a Go CLI for automating the deployment and updating of Flink jobs;
you can integrate a Jenkins pipeline with it. Maybe it helps.
https://github.com/ing-bank/flink-deployer
On Tue, Apr 9, 2019 at 10:34 AM, Navneeth Krishnan wrote:
> Hi All,
>
> We have some streaming jobs in production and today we manually de
Hi All,
We have some streaming jobs in production, and today we manually deploy the
Flink jobs in each region/environment. Before we start automating it, I just
wanted to check if anyone has already created a CI/CD script for Jenkins or
other CI/CD tools to deploy the latest JAR onto a running Flink cl