Hi! I was curious whether there are docs on how to optimize Flink joins. I
looked around the Flink docs and didn't see much; there is a little on the
Configuration page.
E.g. one of my jobs has an interval join. Does the choice of left vs. right
stream matter for an interval join?
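To make the question concrete, here is a minimal, Flink-free sketch of interval-join semantics (the keys, timestamps, and bounds below are made up for illustration): the bounds are anchored to the left stream, i.e. a right-side record matches when its timestamp falls in [left.ts + lower, left.ts + upper]. Swapping which stream is "left" therefore changes which side anchors the interval, which can affect how much state each side buffers.

```python
# Toy model of an interval join (not Flink API code; data is hypothetical).
# A right record joins a left record with the same key when:
#   left.ts + lower <= right.ts <= left.ts + upper

def interval_join(left, right, lower, upper):
    """left/right: lists of (key, ts, value). Returns matched (key, lv, rv)."""
    out = []
    for lk, lts, lv in left:
        for rk, rts, rv in right:
            if lk == rk and lts + lower <= rts <= lts + upper:
                out.append((lk, lv, rv))
    return out

left = [("a", 10, "L1")]
right = [("a", 12, "R1"), ("a", 25, "R2")]
# With bounds [-2, +5] relative to the left record at ts=10,
# only the right record at ts=12 matches (8 <= 12 <= 15).
print(interval_join(left, right, -2, 5))  # [('a', 'L1', 'R1')]
```

Flipping left and right with the same numeric bounds would anchor the window to the other stream's timestamps, so the two orientations are not interchangeable.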
given:
[source] -> [operator 1] -> [operator 2] -> [sink].
If, in the dashboard, operator 1 shows backpressure, does that mean I need
to improve the performance of operator 2 in order to alleviate the
backpressure on operator 1?
Great, thank you for the help, Matthias
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
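On the backpressure question above, a toy model (not Flink code; buffer size and rates are invented) of why a slow downstream operator makes the *upstream* operator show as backpressured: operator 1 blocks when the buffer toward operator 2 is full, so the dashboard flags operator 1 even though operator 2 is the bottleneck.

```python
# Toy backpressure model: op1 emits into a bounded buffer, op2 drains it.
from collections import deque

BUFFER_CAP = 3  # hypothetical buffer size between op1 and op2

def run_pipeline(n_records, op2_rate):
    """op2_rate: records operator 2 can drain per emitted record.
    Returns how often operator 1 was blocked (backpressured)."""
    buffer = deque()
    blocked = 0
    for r in range(n_records):
        if len(buffer) < BUFFER_CAP:
            buffer.append(r)      # op1 emits freely
        else:
            blocked += 1          # buffer full: op1 is backpressured
        for _ in range(op2_rate): # op2 drains at its own rate
            if buffer:
                buffer.popleft()
    return blocked

print(run_pipeline(10, op2_rate=0))  # slow op2 -> op1 blocked often: 7
print(run_pipeline(10, op2_rate=1))  # fast op2 -> no backpressure: 0
```

So yes: backpressure shown at operator 1 usually means the bottleneck is downstream of it, and speeding up operator 2 (or increasing its parallelism) is the typical fix.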
Thanks Matthias, I think it will help to find out which live keys are
present.
Let me check and report back on the thread.
On Wed, Feb 10, 2021 at 10:46 PM Matthias Pohl
wrote:
> Hi narashima,
> not sure whether this fits your use case, but have you considered creating
> a savepoint and anal
Hi Barisa,
thanks for sharing this. I'm gonna add Till to this thread. He might have
some insights.
Best,
Matthias
On Wed, Feb 10, 2021 at 4:19 PM Barisa Obradovic wrote:
> I'm trying to understand if behaviour of the flink jobmanager during
> zookeeper upgrade is expected or not.
>
> I'm runni
As in
https://github.com/aws/aws-sdk-java/blob/41a577e3f667bf5efb3d29a46aaf210bf70483a1/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/transfer/TransferManager.java#L2378
never gets called as it is never GCed...
On Wed, Feb 10, 2021 at 10:47 AM Vishal Santoshi
wrote:
> Thank you,
>
> Th
Hi narashima,
not sure whether this fits your use case, but have you considered creating
a savepoint and analyzing it using the State Processor API [1]?
Best,
Matthias
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/libs/state_processor_api.html#state-processor-api
On Wed, Feb 10,
Hi Daniel,
what's the exact configuration you used? Did you use the resource
definitions provided in the Standalone Flink on Kubernetes docs [1]? Did
you do certain things differently in comparison to the documentation?
Best,
Matthias
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.
It is not solving the problem.
I can see the memory keep increasing, resulting in a lot of long GC pauses.
There could be a memory leak; I just want to know how to tell whether older
keys are still alive, even after the pattern has been satisfied or the
within-range window of the pattern has expired.
Can someone sugg
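As a side note on what "live keys" means here, a toy, Flink-free model (the key names, timestamps, and TTL are made up): each key records when it was last seen, and a scan reports which keys are still within their time-to-live. In Flink itself, comparable insight can come from configuring state TTL or from inspecting a savepoint, as suggested elsewhere in this thread.

```python
# Toy model of keyed-state liveness with a TTL (hypothetical data).

def live_keys(last_seen, now, ttl):
    """Return keys whose last event is within `ttl` of `now`."""
    return {k for k, ts in last_seen.items() if now - ts <= ttl}

last_seen = {"user-1": 100, "user-2": 40, "user-3": 95}
print(sorted(live_keys(last_seen, now=110, ttl=30)))  # ['user-1', 'user-3']
```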
Hi,
have you tried using the bundled Hadoop uber jar [1]? It looks like some
Hadoop dependencies are missing.
Best,
Matthias
[1] https://flink.apache.org/downloads.html#additional-components
On Wed, Feb 10, 2021 at 1:24 PM meneldor wrote:
> Hello,
> I am using PyFlink and I want to write records f
Hi, I modified the Stateful Functions 2.2.0 async example to include a real
binding to Kafka. I included statefun-flink-distribution and
stateful-kafka-io in the pom and created a fat jar using the
maven-assembly-plugin,
and my Flink cluster complains about:
java.lang.IllegalStateException: Unab
I'm trying to understand whether the behaviour of the Flink JobManager
during a ZooKeeper upgrade is expected or not.
I'm running Flink 1.11.2 in Kubernetes, with ZooKeeper server 3.5.4-beta.
While I'm doing the ZooKeeper upgrade, there is a 20-second ZooKeeper
downtime. I'd expect either the Flink job to restar
FileSystems must not be bundled in the user jar.
You must place them in lib/ or plugins/, because by bundling it you
break our assumption that they exist for the lifetime of the cluster
(which in turn means we don't really have to worry about cleaning up).
On 2/10/2021 4:01 PM, Vishal Santosh
com/amazonaws/services/s3/transfer/TransferManager.class is in
flink-s3-fs-hadoop-1.11.2.jar, which is in plugins/ and, AFAIK, should have
a dedicated ClassLoader per plugin. So does it make sense that these classes
remain beyond the job, and so does the executor service for multipart upload
Hi,
As Kezhu Wang pointed out, this MIGHT BE caused by the
https://issues.apache.org/jira/browse/FLINK-21028 issue.
During the stop-with-savepoint procedure, the source thread might be
interrupted, leaving the whole application in an invalid and inconsistent
state. In Flink 1.12.x one potential symptom i
We do put the flink-hadoop-uber*.jar in the Flink lib/ directory (and thus
it is available to the root classloader). That still does not explain the
executor service outliving the job.
On Tue, Feb 9, 2021 at 6:49 PM Vishal Santoshi
wrote:
> Hello folks,
> We see threads from
> https://github.c
Hello,
I am using PyFlink and I want to write records from the table sql api as
parquet files on AWS S3. I followed the documentations but it seems that
I'm missing some dependencies or/and configuration. Here is the SQL:
> CREATE TABLE sink_table(
> `id` VARCHAR,
> `type` VARCHAR,
>
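For reference, a hedged sketch of what a filesystem/Parquet sink can look like in Flink SQL — the table name, columns, and bucket path below are invented for illustration (not a completion of the truncated DDL above), and it assumes the S3 filesystem plugin and a Parquet format dependency are available on the cluster:

```sql
-- Hypothetical example, not the poster's actual table definition
CREATE TABLE example_sink (
  `id`   VARCHAR,
  `type` VARCHAR
) WITH (
  'connector' = 'filesystem',
  'path'      = 's3://my-example-bucket/output/',
  'format'    = 'parquet'
);
```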
> Actually, my use case is that I want to share the state of one stream
with two other streams. Right now, I can think of connecting this stream
independently with each of the two other streams and managing the state
twice, effectively duplicating it.
> Only the matching keys (with the two other strea
Hi,
Yes, but the stream whose state I want to share will be unbounded and
high-volume. Also, not all keys from that stream have to go to every task
node; only the matching keys (with the two other streams) will.
Please let me know if there is another, cleaner way to achieve this.
Hi Dongwon,
With State Processor API you should be able to create a new snapshot that
doesn't reference the unused classes.
Regards,
Roman
On Tue, Feb 9, 2021 at 3:39 AM Dongwon Kim wrote:
> Hi Khachatryan,
>
> Thanks for the explanation and the input!
>
> 1. Use the State Processor API to cr
Right, in this case FileSystemStateBackend is the right choice.
The state size is limited by TM memory as you said.
Regards,
Roman
On Tue, Feb 9, 2021 at 8:54 AM yidan zhao wrote:
> What I am interested in is whether I should use rocksDB to replace
> fileBackend.
> RocksDB's performance is not