Hi Xintong Song,
I tried using the Java options in flink-conf.yaml to generate a heap dump, referring to the docs [1];
however, after adding this the task manager containers are not coming up.
Note that I am using EMR. Am I doing anything wrong here?
env.java.opts: "-XX:+HeapDumpOnOutOfMemoryError
-XX:Heap
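For reference, a complete example of this kind of setting (the dump path here is only a hypothetical example) would look like:
env.java.opts: "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/taskmanager.hprof"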
Does anyone by any chance have a working example (of course without the
credentials etc.) that can be shared on GitHub? Simply reading/writing a
file from/to S3.
I keep struggling with this one and getting weird exceptions.
Thanks
On Thu, Mar 4, 2021 at 7:30 PM Avi Levi wrote:
> Sure, This is
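To be concrete, a minimal sketch of what I have in mind (bucket and paths are made up; it assumes the flink-s3-fs-hadoop or flink-s3-fs-presto plugin and credentials are already configured on the cluster):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class S3ReadWrite {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Read a text file from S3 (requires the S3 filesystem plugin on the cluster).
        DataStream<String> lines = env.readTextFile("s3://my-bucket/input/data.txt");

        // Trivial transformation, then write the result back to S3.
        lines.map(new MapFunction<String, String>() {
            @Override
            public String map(String value) {
                return value.toUpperCase();
            }
        }).writeAsText("s3://my-bucket/output/");

        env.execute("s3-read-write");
    }
}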
Is it possible for an operator to receive two different kinds of
broadcasts?
Is it possible for an operator to receive two different types of streams
and a broadcast? For example, I know there is a KeyedCoProcessFunction, but
is there a version of that which can also receive broadcasts?
Hi Flink community,
You are invited to improve your data processing skills with the *Beam
College* webinars!
If you know about Apache Beam but haven’t used it in production yet, or
you want to learn best practices to optimize your Beam pipelines, then
Beam College is for you!
Beam College,
Glad to hear it! Thanks for letting us know.
David
On Fri, Mar 5, 2021 at 10:22 PM Roger wrote:
> Confirmed. This worked!
> Thanks!
> Roger
>
> On Fri, Mar 5, 2021 at 12:41 PM Roger wrote:
>
>> Hey David.
>> Thank you very much for your response. This is making sense now. It was
>> confusing b
Thanks Shuiqiang! That's really helpful, we'll give the connectors a try.
On Wed, Mar 3, 2021 at 4:02 AM Shuiqiang Chen wrote:
> Hi Kevin,
>
> Thank you for your questions. Currently, users are not able to define
> custom sources/sinks in Python. This is a great feature that can unify the
> end
Hello everyone,
I'm looking to run a PyFlink application in a distributed fashion,
using Kubernetes, and am currently facing issues. I've successfully gotten
a Scala Flink application to run using the manifests provided at [0].
I attempted to run the application by updating the jobmanager comm
Confirmed. This worked!
Thanks!
Roger
On Fri, Mar 5, 2021 at 12:41 PM Roger wrote:
> Hey David.
> Thank you very much for your response. This is making sense now. It was
> confusing because I was able to use the Broadcast stream prior to adding
> the second stream. However, now I realize that th
Rion,
A given JdbcSink can only write to one table, but if the number of tables
involved isn't unreasonable, you could use a separate sink for each table,
and use side outputs [1] from a process function to steer each record to
the appropriate sink.
I suggest you avoid trying to implement a sink.
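Roughly, the pattern looks like this (record types, routing logic, and sink wiring below are only placeholders):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class TableRouter {
    // Records for the second table go to this side output; everything else to the main output.
    static final OutputTag<String> TABLE_B = new OutputTag<String>("table-b") {};

    static SingleOutputStreamOperator<String> route(DataStream<String> input) {
        return input.process(new ProcessFunction<String, String>() {
            @Override
            public void processElement(String value, Context ctx, Collector<String> out) {
                if (value.startsWith("b:")) {
                    ctx.output(TABLE_B, value);   // steer to the table-b sink
                } else {
                    out.collect(value);           // main output feeds the table-a sink
                }
            }
        });
    }
}

// Usage, one JdbcSink per table (tableASink / tableBSink not shown):
//   SingleOutputStreamOperator<String> routed = route(records);
//   routed.addSink(tableASink);
//   routed.getSideOutput(TableRouter.TABLE_B).addSink(tableBSink);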
Hey David.
Thank you very much for your response. This is making sense now. It was
confusing because I was able to use the Broadcast stream prior to adding
the second stream. However, now I realize that this part of the pipeline
occurs after the windowing so I'm not affected the same way. This is
d
This is a watermarking issue. Whenever an operator has two or more input
streams, its watermark is the minimum of watermarks of the incoming
streams. In this case your broadcast stream doesn't have a watermark
generator, so it is preventing the watermarks from advancing. This in turn
is preventing
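One possible workaround (sketched against the Flink 1.8 AssignerWithPeriodicWatermarks API) is to give the broadcast side a generator that always reports Long.MAX_VALUE, so the minimum watermark is driven entirely by the record stream:

import javax.annotation.Nullable;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;

// Assign this to the broadcast-side stream so it never holds back the watermark:
// min(recordStreamWatermark, Long.MAX_VALUE) == recordStreamWatermark.
public class MaxWatermarkAssigner<T> implements AssignerWithPeriodicWatermarks<T> {
    @Nullable
    @Override
    public Watermark getCurrentWatermark() {
        return new Watermark(Long.MAX_VALUE);
    }

    @Override
    public long extractTimestamp(T element, long previousElementTimestamp) {
        // Keep whatever timestamp the element already carries.
        return previousElementTimestamp;
    }
}

// Usage: controlStream.assignTimestampsAndWatermarks(new MaxWatermarkAssigner<>())
//                     .broadcast(stateDescriptor);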
Hi all,
I’ve been playing around with a proof-of-concept application with Flink to
assist a colleague of mine. The application is fairly simple (take in a
single input and identify various attributes about it) with the goal of
outputting those to separate tables in Postgres:
object AttributeIdent
Hi Timo,
After investigating this further, this is actually not related to
implementing SupportsWatermarkPushdown.
Once I create a TableSchema for my custom source's RowData, and assign it a
watermark (see my example in the original mail), the plan will always
include a LogicalWatermarkAssigner. T
Hi Timo,
Yes, I have gone through the link. But for the other metrics, the documentation
has descriptions.
For example,
numBytesOut - The total number of bytes this task has emitted.
lastCheckpointSize - The total size of the last checkpoint (in bytes).
For the latency metrics I don't see such descr
Hello.
I am having an issue with a Flink 1.8 pipeline when trying to consume
broadcast state across multiple operators. I currently
have a working pipeline that looks like the following:
records
.assignTimestampsAndWatermarks(
new BoundedOutOfOrdernessGenerator(
Long.parseLong(proper
Trying to spin up a Python Flink instance in my Kubernetes cluster with
this configuration ...
sudo ./bin/flink run \
--target kubernetes-session \
-Dkubernetes.cluster-id=flink-python \
-Dkubernetes.namespace=cmdaa \
-Dkubernetes.container.image=cmdaa/pyflink:0.0.1 \
--pyModule word_count \
--pyF
Thanks Roman and Yuan for your work and driving the release process :)
On Fri, Mar 5, 2021 at 3:53 PM Till Rohrmann wrote:
> Great work! Thanks a lot for being our release managers Roman and Yuan and
> to everyone who has made this release possible.
>
> Cheers,
> Till
>
> On Fri, Mar 5, 2021 at 10
Yes, that might be an issue. As far as I remember, the universal connector
works with Kafka 0.10.x or higher.
Piotrek
On Fri, Mar 5, 2021 at 11:20 AM Witzany, Tomas wrote:
> Hi,
> thanks for your answer. It seems like it will not be possible for me to
> upgrade to the newer universal Flink produce
Yes, it might be the case. Hard to tell for sure without looking at the
job, metrics etc. Just be mindful of what I described, and if you want to
fine tune a job and set different parallelism values for different
operators, pay attention to where those operators are being distributed.
Usually in pr
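For illustration, a small self-contained sketch of setting different parallelism values for individual operators (the values are arbitrary):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ParallelismExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(4); // job-wide default

        env.fromElements("a", "b", "c")
           .map(new MapFunction<String, String>() {
               @Override
               public String map(String value) {
                   return value.toUpperCase();
               }
           })
           .setParallelism(2)   // override just for this operator
           .print()
           .setParallelism(1);  // single-threaded sink

        env.execute("parallelism-example");
    }
}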
Hi Yuval,
sorry that nobody replied earlier. Somehow your email fell through the
cracks.
If I understand you correctly, you would like to implement a table
source that implements both `SupportsWatermarkPushDown` and
`SupportsFilterPushDown`?
The current behavior might be on purpose. Filt
Hello there,
I am running Flink 1.11.3 on Kubernetes deployment. If I change a
setting and re-deploy my Flink setup, the new setting is correctly
applied in the config file but is not being honored by Flink. In other
words, I can ssh into the pod and check the config file - it has the
new setting a
Hi Shilpa,
Shuiqiang is right. Currently, we recommend using SQL DDL until the
connect API is updated. See here:
https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/create/#create-table
Especially the WATERMARK section shows how to declare a rowtime attribute.
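A minimal sketch of such a DDL issued from a TableEnvironment (connector, topic, and column names are placeholders):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DdlExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Declare a rowtime attribute via the WATERMARK clause.
        tEnv.executeSql(
            "CREATE TABLE events (" +
            "  user_id STRING," +
            "  event_time TIMESTAMP(3)," +
            "  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND" +
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'events'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'properties.group.id' = 'my-group'," +
            "  'scan.startup.mode' = 'earliest-offset'," +
            "  'format' = 'json'" +
            ")");
    }
}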
Regards,
Hi Suchithra,
did you see this section in the docs?
https://ci.apache.org/projects/flink/flink-docs-stable/ops/metrics.html#latency-tracking
Regards,
Timo
On 05.03.21 15:31, V N, Suchithra (Nokia - IN/Bangalore) wrote:
Hi,
I am using flink 1.12.1 version and trying to explore latency metrics
I don't know what the resulting plan for your query looks like. You can
print it via `env.sqlQuery().explain()`. But I could imagine that by
simplifying the query you would also reduce the number of retraction
messages/operators in the pipeline.
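For example, a small self-contained sketch (table and query are hypothetical):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class ExplainExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());
        tEnv.executeSql(
            "CREATE TABLE clicks (user_id STRING, url STRING) WITH ('connector' = 'datagen')");

        Table result = tEnv.sqlQuery(
            "SELECT user_id, COUNT(*) AS cnt FROM clicks GROUP BY user_id");

        // Prints the abstract syntax tree and the optimized physical plan.
        System.out.println(result.explain());
    }
}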
Regards,
Timo
On 05.03.21 13:28, Yik San Chan
Thanks for this proposal Guowei. +1 for it.
Concerning the default size, maybe we can run some experiments and see how
the system behaves with different pool sizes.
Cheers,
Till
On Fri, Mar 5, 2021 at 2:45 PM Stephan Ewen wrote:
> Thanks Guowei, for the proposal.
>
> As discussed offline alrea
Great work! Thanks a lot for being our release managers Roman and Yuan and
to everyone who has made this release possible.
Cheers,
Till
On Fri, Mar 5, 2021 at 10:43 AM Yuan Mei wrote:
> Cheers!
>
> Thanks, Roman, for doing the most time-consuming and difficult part of the
> release!
>
> Best,
>
Hi,
I am using flink 1.12.1 version and trying to explore latency metrics with
Prometheus. I have enabled latency metrics by adding "metrics.latency.interval:
1" in flink-conf.yaml.
I have submitted a Flink streaming job which has Source->flatmap->process->sink,
which is chained into a single task
Thanks Guowei, for the proposal.
As discussed offline already, I think this sounds good.
One thought is that 16m sounds very small for a default read buffer pool.
How risky do you think it is to increase this to 32m or 64m?
Best,
Stephan
On Fri, Mar 5, 2021 at 4:33 AM Guowei Ma wrote:
> Hi, a
Hi Timo,
If I understand correctly, the UDF only simplifies the query but does not do
anything functionally different. Please correct me if I am wrong, thank you!
Best,
Yik San
On Thu, Mar 4, 2021 at 8:34 PM Timo Walther wrote:
> Yes, implementing a UDF might be the most convenient option for s
Hi,
thanks for your answer. It seems like it will not be possible for me to upgrade
to the newer universal Flink producer, because of an older Kafka version I am
reading from. So unfortunately for now I will have to go with the hack.
Thanks
From: Piotr Nowojski
S
Cheers!
Thanks, Roman, for doing the most time-consuming and difficult part of the
release!
Best,
Yuan
On Fri, Mar 5, 2021 at 5:41 PM Roman Khachatryan wrote:
> The Apache Flink community is very happy to announce the release of Apache
> Flink 1.12.2, which is the second bugfix release for th
The Apache Flink community is very happy to announce the release of Apache
Flink 1.12.2, which is the second bugfix release for the Apache Flink 1.12
series.
Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
Hi Hemant,
This exception generally suggests that the JVM is running out of heap memory.
Per the official documentation [1], the amount of live data barely fits
into the Java heap, leaving little free space for new allocations.
You can try to increase the heap size following these guides [2].
If a mem
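A rough flink-conf.yaml sketch of the kind of change those guides describe (the values below are only placeholders to adjust for your workload):

taskmanager.memory.process.size: 4096m
# or, to target the task heap directly:
# taskmanager.memory.task.heap.size: 2048m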
Hi deepthi,
Thanks for trying the Kubernetes HA service.
> Do I need standby JobManagers?
I think the answer depends on your production requirements. Usually, it is
unnecessary to have more than one JobManager, because we are using the
Kubernetes Deployment to manage the JobManager.
Once it cra
Hi,
Getting the below OOM but the job failed 4-5 times and recovered from there.
java.lang.Exception: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.checkThrowSourceExecutionException(S
Hey Piotr,
thanks for your answer, that makes perfect sense. However, when looking at the number of messages being processed, we can see that both subtasks on task 2 will produce the same amount of messages in the (1-2-1-1-1) scenario, even with the first task hitting backpressure. We assume t