Hi Dan,
> I've tried playing around with parallelism and resources. It does help.
Glad to hear your problem is solved 😀.
> Does Flink have special logic with the built-in interval join code that
impacts how Kafka data sources are read?
No. If you are referring to the way I mentioned in the last email, I mean
Thanks JING and Caizhi!
Yea, I've tried playing around with parallelism and resources. It does
help.
We have our own join operator that acts like an interval join (with fuzzy
matching). We wrote our own KeyedCoProcessFunction and modeled it closely
after the internal interval join code. Does F
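(For readers following along, here is a minimal sketch of what an interval-style join as a KeyedCoProcessFunction can look like. It is not the actual operator described above and omits the fuzzy matching; the Event type, field names, and single symmetric bound are assumptions for illustration, and it presumes timestamps/watermarks are assigned upstream.)

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.typeutils.ListTypeInfo;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.KeyedCoProcessFunction;
import org.apache.flink.util.Collector;

/** Hypothetical event type; the real job's records will differ. */
class Event {
    public String key;
    public long ts; // event timestamp in milliseconds
    public Event() {}
}

/** Joins left/right events (same key) whose timestamps are at most `bound` ms apart. */
class IntervalStyleJoin
        extends KeyedCoProcessFunction<String, Event, Event, Tuple2<Event, Event>> {

    private final long bound;
    private transient MapState<Long, List<Event>> leftBuffer;
    private transient MapState<Long, List<Event>> rightBuffer;

    IntervalStyleJoin(long bound) {
        this.bound = bound;
    }

    @Override
    public void open(Configuration parameters) {
        // One buffer per side, keyed by event timestamp, like the built-in interval join.
        leftBuffer = getRuntimeContext().getMapState(new MapStateDescriptor<Long, List<Event>>(
                "left-buffer", Types.LONG, new ListTypeInfo<>(Event.class)));
        rightBuffer = getRuntimeContext().getMapState(new MapStateDescriptor<Long, List<Event>>(
                "right-buffer", Types.LONG, new ListTypeInfo<>(Event.class)));
    }

    @Override
    public void processElement1(Event left, Context ctx, Collector<Tuple2<Event, Event>> out)
            throws Exception {
        buffer(leftBuffer, left);
        // Match against already-buffered right events within the bound.
        for (Map.Entry<Long, List<Event>> entry : rightBuffer.entries()) {
            if (Math.abs(entry.getKey() - left.ts) <= bound) {
                for (Event right : entry.getValue()) {
                    out.collect(Tuple2.of(left, right));
                }
            }
        }
        // Once the watermark passes ts + bound this event can no longer match anything.
        ctx.timerService().registerEventTimeTimer(left.ts + bound);
    }

    @Override
    public void processElement2(Event right, Context ctx, Collector<Tuple2<Event, Event>> out)
            throws Exception {
        buffer(rightBuffer, right);
        for (Map.Entry<Long, List<Event>> entry : leftBuffer.entries()) {
            if (Math.abs(entry.getKey() - right.ts) <= bound) {
                for (Event left : entry.getValue()) {
                    out.collect(Tuple2.of(left, right));
                }
            }
        }
        ctx.timerService().registerEventTimeTimer(right.ts + bound);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Tuple2<Event, Event>> out)
            throws Exception {
        // The timer fired at ts + bound, so drop the buffered events at ts on both sides.
        leftBuffer.remove(timestamp - bound);
        rightBuffer.remove(timestamp - bound);
    }

    private static void buffer(MapState<Long, List<Event>> state, Event e) throws Exception {
        List<Event> list = state.get(e.ts);
        if (list == null) {
            list = new ArrayList<>();
        }
        list.add(e);
        state.put(e.ts, list);
    }
}

Buffering per timestamp and cleaning up with event-time timers mirrors the approach of the built-in interval join.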
Hi Yun,
Thanks a lot for your reply!
Regarding the parallelism, I think the `table.exec.resource.default-parallelism`
option that you mentioned is a good alternative, but it requires a SET operation
before running each query. And since it’s a `default` value, I
suppose there should be an option with highe
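(For context, a rough sketch of setting that option programmatically before running queries; the value 16 is just a placeholder, and in the SQL client the equivalent is a SET statement.)

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DefaultParallelismSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());
        // Same effect as: SET 'table.exec.resource.default-parallelism' = '16';
        tEnv.getConfig().getConfiguration()
                .setInteger("table.exec.resource.default-parallelism", 16);
        // ... register tables and run queries here ...
    }
}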
Thanks @Till Rohrmann for starting this discussion.
Firstly, I'm trying to understand the benefit of a shorter heartbeat timeout.
IIUC, it will make the JobManager aware of a lost TaskManager faster. However,
it seems that only the standalone cluster could benefit from this. For YARN and
native Kubernetes depl
Hi!
You'll need to set
props.put("schema.registry.basic.auth.user.info", "<username>:<password>");
tkg_cangkul wrote on Wed, Jul 21, 2021 at 12:06 AM:
> Hi,
>
> I'm trying to connect to Kafka with a schema registry that uses SSL, with
> Flink 1.11.2.
>
> I've got this error message when I try to submit the job:
>
> Caused by: io
Hi!
Streaming joins will not throw away records in their state unless they exceed
the TTL. Have you tried increasing the parallelism of the join operators (and
maybe decreasing the parallelism of the large Kafka source)?
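(To make the TTL remark concrete, a small sketch of how a TTL is typically attached to state in a custom function; the 24-hour value and state name are placeholders, and this is not a claim that it solves the backfill issue.)

import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;

public class StateTtlSketch {
    public static ValueStateDescriptor<Long> descriptorWithTtl() {
        StateTtlConfig ttlConfig = StateTtlConfig
                .newBuilder(Time.hours(24))                                 // placeholder TTL
                .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)  // refresh TTL on writes
                .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
                .build();
        ValueStateDescriptor<Long> descriptor =
                new ValueStateDescriptor<>("last-seen", Long.class);
        descriptor.enableTimeToLive(ttlConfig);
        return descriptor;
    }
}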
Dan Hill wrote on Wed, Jul 21, 2021 at 4:19 AM:
> Hi. My team's flink job has cascading interval
Hi!
For this use case it seems that we'll either have to use a custom connector
for that external database (if there is currently no such connector in Flink),
or first read the data into some other source supported by Flink.
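(If the external database happens to be reachable over JDBC, Flink's bundled JDBC connector may already cover it; a rough sketch follows, where the table name, columns, and connection details are made up, and the flink-connector-jdbc dependency plus the database's JDBC driver are required.)

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class JdbcDimTableSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());
        // Exposes the external table to Flink SQL so it can be joined with the
        // Kafka-backed table before the 10-minute windowed aggregation.
        tEnv.executeSql(
                "CREATE TABLE dim_accounts (" +
                "  account_id BIGINT," +
                "  account_name STRING" +
                ") WITH (" +
                "  'connector' = 'jdbc'," +
                "  'url' = 'jdbc:mysql://db-host:3306/mydb'," +
                "  'table-name' = 'accounts'," +
                "  'username' = '<user>'," +
                "  'password' = '<password>'" +
                ")");
    }
}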
Sweta Kalakuntla wrote on Wed, Jul 21, 2021 at 10:39 AM:
> That is my unders
That is my understanding as well, but I wanted to confirm.
My use case is that I need to get additional data from an external database (not
a Flink source) for the data I am reading off of the Kafka source, and aggregate
the data over a 10-minute window for a given key. But I did not want to query for
every key in that 10min w
Hi, sorry for resurrecting the old thread, but I have precisely the same issue
with a different filesystem.
I've tried using the plugins dir, setting FLINK_PLUGINS_DIR, etc., but nothing
works locally. I added a breakpoint in the PluginConfig.getPluginsDir method and
confirmed it's not even called.
Hi. My team's Flink job has cascading interval joins. The problem I'm
outlining below is fine when streaming normally; it's an issue with
backfills. We've been running a bunch of backfills to evaluate the
job over older data.
When running as backfills, I've noticed that sometimes one of d
It's after a checkpoint failure. I don't know if that includes a restore
from a checkpoint.
I'll take some screenshots when the jobs hit the failure again. All of my
currently running jobs are healthy right now and haven't hit a checkpoint
failure.
On Sun, Jul 18, 2021 at 11:34 PM Dawid Wysakow
Hi,
I'm trying to connect to Kafka with a schema registry that uses SSL, with
Flink 1.11.2.
I've got this error message when I try to submit the job:
Caused by:
io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException:
Unauthorized; error code: 401
Can anyone help me to
Hi Alexis,
I hope I'm not stating the obvious, but have you checked this documentation
page:
https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/debugging/debugging_classloading/#unloading-of-dynamically-loaded-classes-in-user-code
In particular, the shutdown hooks we've introduced in F
+1 to this change!
When I was working on the reactive mode blog post [1] I also ran into this
issue, leading to a poor "out of the box" experience when scaling down.
For my experiments, I've chosen a timeout of 8 seconds, and the cluster has
been running for 76 days (so far) on Kubernetes.
I also
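(For anyone who wants to try a similar value locally, a sketch of setting the timeout programmatically for a test; in a real deployment this would normally go into flink-conf.yaml as heartbeat.timeout, and 8000 ms simply mirrors the 8 seconds mentioned above.)

import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.HeartbeatManagerOptions;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class HeartbeatTimeoutSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Too-low values can cause spurious TaskManager losses, so tune with care.
        conf.set(HeartbeatManagerOptions.HEARTBEAT_TIMEOUT, 8000L);
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment(2, conf);
        // ... build and execute a job with `env` ...
    }
}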
Igal,
Thanks for the answers. Is there any JS SDK available?
Best,
--Omid
On Tue, Jul 20, 2021 at 10:23 AM Igal Shilman wrote:
> Hi Omid,
>
> I'm glad to hear that you are evaluating StateFun in your company! Let me
> try to answer your questions:
>
> 1. In version 2.x, StateFun only supported
Hi Omid,
I'm glad to hear that you are evaluating StateFun in your company! Let me
try to answer your questions:
1. In version 2.x, StateFun only supported messages of type
com.google.protobuf.Any, and we had a tiny optimization that
read type hints and unpacked the real message out of the Any m
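(To make the Any handling concrete for readers unfamiliar with protobuf, a small sketch of packing and unpacking; MyMessage is a placeholder for any generated protobuf type, and this is only an approximation of the optimization described above.)

import com.google.protobuf.Any;
import com.google.protobuf.InvalidProtocolBufferException;

public class AnyPackingSketch {
    // MyMessage stands in for a generated protobuf message class.
    static Any wrap(MyMessage msg) {
        // Packing stores the payload plus a type URL hint such as
        // "type.googleapis.com/my.package.MyMessage".
        return Any.pack(msg);
    }

    static MyMessage unwrap(Any any) throws InvalidProtocolBufferException {
        // Check the type hint first, then unpack the real message.
        if (any.is(MyMessage.class)) {
            return any.unpack(MyMessage.class);
        }
        throw new IllegalArgumentException("Unexpected payload type: " + any.getTypeUrl());
    }
}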
I guess this is unlikely, because nobody is working on the mentioned
tickets.
I mentioned your request in the ticket, to raise awareness again, but I
would still consider it unlikely.
On Tue, Jul 20, 2021 at 1:57 PM Wanghui (HiCampus) wrote:
> Hi Robert:
>
> Thank you for your reply.
>
Hi Prasanna,
which Flink version and Kafka connector are you using? (the "KafkaSource"
or "FlinkKafkaConsumer"?)
The partition assignment for the FlinkKafkaConsumer is defined here:
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/st
Hi Robert:
Thank you for your reply.
Will this feature be released in version 1.14?
Best,
Hui
From: Robert Metzger [mailto:rmetz...@apache.org]
Sent: July 20, 2021, 19:45
To: Wanghui (HiCampus)
Cc: user@flink.apache.org
Subject: Re: Some question of RocksDB state backend on ARM os
The RocksDB ver
The RocksDB version provided by Flink does not currently run on ARM.
However, there are some efforts / hints:
- https://stackoverflow.com/a/44573013/568695
- https://issues.apache.org/jira/browse/FLINK-13448
- https://issues.apache.org/jira/browse/FLINK-13598
I would recommend voting and commenti
Your understanding of the problem is correct -- the serialization cost is
the reason for the high CPU usage.
What you can also try to optimize is the serializers you are using (by
using data types that are efficient to serialize). See also this blog post:
https://flink.apache.org/news/2020/04/15/f
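(To make that concrete, one common check is to ensure your record types are recognized as Flink POJOs so the efficient PojoSerializer is used instead of a Kryo fallback; the Order type below is made up.)

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SerializerSketch {
    // A Flink POJO: a public no-arg constructor and public (or getter/setter)
    // fields let Flink use its PojoSerializer instead of Kryo.
    public static class Order {
        public long orderId;
        public String customer;
        public double amount;
        public Order() {}
    }

    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Fail fast if any type would silently fall back to the slower Kryo path.
        env.getConfig().disableGenericTypes();
        // ... build the pipeline with Order records, then env.execute(...) ...
    }
}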
Are you using remote disks for RocksDB? (I guess that's EBS on AWS.) AFAIK
there are usually limitations wrt the IOPS you can perform.
I would generally recommend measuring where the bottleneck is coming from.
It could be that your CPUs are at 100%, then adding more machines / cores
will help (m