Congrats, and thanks for the release manager work, Zhu Zhu, and to everyone involved!
Best,
Zhijiang
--
From:liupengcheng
Send Time:2020 Aug. 26 (Wed.) 19:37
To:dev ; Xingbo Huang
Cc:Guowei Ma ; user-zh ; Yangze
Guo ; Dian Fu ; Zhu Zhu
Congrats, Dian!
--
From:Yun Gao
Send Time:2020 Aug. 27 (Thu.) 17:44
To:dev ; Dian Fu ; user
; user-zh
Subject:Re: Re: [ANNOUNCE] New PMC member: Dian Fu
Congratulations Dian !
Best
Yun
---
custom partitioner which can
control the logic of keyBy distribution based on pre-defined cache distribution
in nodes?
Best,
Zhijiang
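As an illustration of that idea, here is a minimal sketch of a cache-aware custom partitioner (the String key type and the routing map are assumptions, not something from this thread):

import java.util.Map;
import org.apache.flink.api.common.functions.Partitioner;

// Routes each key to the subtask that is assumed to already cache its data;
// the routing map is a placeholder for the pre-defined cache distribution.
public class CacheAwarePartitioner implements Partitioner<String> {

    private final Map<String, Integer> keyToCachedSubtask;

    public CacheAwarePartitioner(Map<String, Integer> keyToCachedSubtask) {
        this.keyToCachedSubtask = keyToCachedSubtask;
    }

    @Override
    public int partition(String key, int numPartitions) {
        Integer subtask = keyToCachedSubtask.get(key);
        // Fall back to hash distribution for keys without a pre-defined location.
        return subtask != null ? subtask % numPartitions : Math.abs(key.hashCode()) % numPartitions;
    }
}

// Usage instead of keyBy(): stream.partitionCustom(new CacheAwarePartitioner(routing), value -> value);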
--
From:Navneeth Krishnan
Send Time:2020 Sep. 23 (Wed.) 02:21
To:user
Subject:Adaptive load balancing
ssure, it should go down to
milliseconds.
Best,
Zhijiang
--
From:Felipe Gutierrez
Send Time:2020 Mar. 6 (Fri.) 05:04
To:user
Subject:Backpressure and 99th percentile latency
Hi,
I am a bit confused about the topic of tracking la
be increased along with the trend of increased
input&outputQueueLength and input&outputPoolUsage. All of them should be
proportional and show the same trend in most cases.
Best,
Zhijiang
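As a side note, the latency metrics discussed here only show up once latency tracking is enabled; a minimal sketch (the interval value is just an example):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LatencyTrackingSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Emit latency markers every 5 seconds; the reported latency typically rises
        // together with the queue length and pool usage metrics once backpressure builds up.
        env.getConfig().setLatencyTrackingInterval(5000L);
    }
}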
--
From:Felipe Gutierrez
Send
e from the
wrapped real exception,
then users can easily get the root cause directly, not only for the current
message "Could not forward element to next operator".
Best,
Zhijiang
--
From:Tzu-Li (Gordon) Tai
Send Time:20
skMetricGroup.
Hope it solves your problem.
Best,
Zhijiang
--
From:Felipe Gutierrez
Send Time:2020 Mar. 17 (Tue.) 17:50
To:user
Subject:Re: How do I get the outPoolUsage value inside my own stream operator?
Hi,
just for the recor
Hi Humberto,
I guess Fuji is familiar with Flink CEP and he can answer the question you proposed.
I have already cc'd him.
Best,
Zhijiang
--
From:Humberto Rodriguez Avila
Send Time:2020 Mar. 18 (Wed.) 17:31
To:user
Subject:FlinkCEP
Hi Lu,
Besides Congxian's replies, you can also get some further explanations from
"https://flink.apache.org/2019/07/23/flink-network-stack-2.html#latency-tracking";.
Best,
Zhijiang
--
From:Congxian Qiu
Send Ti
Thanks for the continuous efforts in engaging with the Flink ecosystem, Jeff!
Glad to see the progressive achievement. Hope more users try it out in practice.
Best,
Zhijiang
--
From:Dian Fu
Send Time:2020 Mar. 31 (Tue.) 10:15
To:Jeff
Great work! Thanks Gordon for the continuous efforts in enhancing stateful
functions and for the efficient release!
Hope stateful functions becomes more and more popular among users.
Best,
Zhijiang
--
From:Yun Tang
Send Time:2020 Apr
Thanks Dian for the release work and thanks everyone involved.
Best,
Zhijiang
--
From:Till Rohrmann
Send Time:2020 Apr. 27 (Mon.) 15:13
To:Jingsong Li
Cc:dev ; Leonard Xu ; Benchao Li
; Konstantin Knauf ; jincheng
sun ; Hequn
work related
metrics.
Check whether there is data skew in your case, which would mean some tasks process
more records than others. If so, maybe we can increase the parallelism to
balance the load.
Best,
Zhijiang
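A minimal sketch of adjusting the parallelism to spread the load (the operator and the values are only illustrative):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ParallelismSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(4); // default parallelism for the whole job

        env.fromElements("a", "b", "c")
                .map(String::toUpperCase)
                .setParallelism(8) // give the heavier operator more subtasks
                .print();

        env.execute("parallelism-sketch");
    }
}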
--
From:Senthil Kumar
Send
Thanks Yu for being the release manager, and thanks to everyone involved.
Best,
Zhijiang
--
From:Arvid Heise
Send Time:2020 May 18 (Mon.) 23:17
To:Yangze Guo
Cc:dev ; Apache Announce List ;
user ; Yu Li ; user-zh
Subject:Re: [ANNOUNCE] Apache
am not sure why you choose rescale to shuffle
data among operators. The default
forward mode can give really good performance if you adjust the
same parallelism among them.
Best,
Zhijiang
--
From:Weihua Hu
Send Time
Regarding monitoring the backpressure, you can refer to the document [1].
As for debugging the backpressure, one option is to trace the jstack of the
respective window task thread which causes the backpressure (it usually has the
maximum in-queue buffers).
After frequently tracing the jstack, you might find
Sorry for missing the document link [1]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/back_pressure.html
--
From:Zhijiang
Send Time:2020 Jun. 11 (Thu.) 11:32
To:Steven Nelson ; user
Subject:Re: Flink 1.10 mem
n" operator is
more than 0? I want to rule out the factors of a buffer leak on the upstream side
and a missing partition request on the downstream side. Then we can further analyze
whether
the input availability notification on downstream s
Congratulations Yu! Well deserved!
Best,
Zhijiang
--
From:Dian Fu
Send Time:2020 Jun. 17 (Wed.) 10:48
To:dev
Cc:Haibo Sun ; user ; user-zh
Subject:Re: [ANNOUNCE] Yu Li is now part of the Flink PMC
Congrats Yu!
Regards,
Dian
using the first config form. But somehow they seem to be two different dimensions
for configuring the checkpoint. One is for the semantics of the data processing
guarantee, and the other is for how we realize, via two different mechanisms,
one (exactly-once) of the semantics.
Best,
Zhijiang
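To make the two dimensions concrete, here is a minimal sketch in the Flink 1.11+ style (the interval value is only an example):

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointConfigSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Dimension 1: the processing-guarantee semantics (exactly-once vs. at-least-once).
        env.enableCheckpointing(60_000L, CheckpointingMode.EXACTLY_ONCE);

        // Dimension 2: which mechanism realizes the exactly-once barriers
        // (aligned by default; unaligned avoids blocking on alignment under backpressure).
        env.getCheckpointConfig().enableUnalignedCheckpoints();
    }
}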
:2020 Jun. 22 (Mon.) 10:53
To:Zhijiang ; user@flink.apache.org
Subject:Re: Unaligned Checkpoint and Exactly Once
Thank you Zhijiang! The second question about config is just because I found a
method in InputProcessorUtil. I guess AT_LEAST_ONCE mode is a simpler way to
handle the checkpoint barrier
://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346364
We would like to thank all contributors of the Apache Flink community who made
this release possible!
Cheers,
Piotr & Zhijiang
Thanks for being the release manager and for the efficient work, Dian!
Best,
Zhijiang
--
From:Konstantin Knauf
Send Time:2020 Jul. 22 (Wed.) 19:55
To:Till Rohrmann
Cc:dev ; Yangze Guo ; Dian Fu
; user ; user-zh
Subject:Re: [ANNOUNCE
your acceptance or not. If the performance does not reach your requirements, we
could further consider other improvements.
Best,
Zhijiang
--
From:Piotr Nowojski
Send Time:2019 Aug. 6 (Tue.) 14:55
To:黄兆鹏
Cc:user
Subject:Re: Will
Congratulations Andrey, great work and well deserved!
Best,
Zhijiang
--
From:Till Rohrmann
Send Time:2019 Aug. 14 (Wed.) 15:26
To:dev ; user
Subject:[ANNOUNCE] Andrey Zagrebin becomes a Flink committer
Hi everyone,
I'm very hap
Congratulations Zili!
--
From:Becket Qin
Send Time:2019 Sep. 12 (Thu.) 03:43
To:Paul Lam
Cc:Rong Rong ; dev ; user
Subject:Re: [ANNOUNCE] Zili Chen becomes a Flink committer
Congrats, Zili!
On Thu, Sep 12, 2019 at 9:39 AM Paul Lam wr
's metric of
`Shuffle.Netty.Input.Buffers.inputQueueLength` on preAggregate side, you might
rely on some external metric reporter to query it if possible.
Best,
Zhijiang
--
From:Felipe Gutierrez
Send Time:2019 Nov. 5 (Tue.
lure which triggers the following cancel operations.
In addition, which flink version are you using?
Best,
Zhijiang
--
From:Felipe Gutierrez
Send Time:2019 Nov. 6 (Wed.) 19:12
To:user
Subject:What metrics can I see the root cause
monitoring/rest_api.html
Best,
Zhijiang
--
From:Felipe Gutierrez
Send Time:2019 Nov. 7 (Thu.) 00:06
To:Chesnay Schepler
Cc:Zhijiang ; user
Subject:Re: How can I get the backpressure signals inside my function or
operator?
If I can t
stack, especially it really spans several
releases.
Best,
Zhijiang
--
From:Hailu, Andreas
Send Time:2019 Nov. 21 (Thu.) 01:03
To:user@flink.apache.org
Subject:RE: CoGroup SortMerger performance degradation from 1.6.4 - 1.9.1
.
// ah
From: Piotr Nowojski On Behalf Of Piotr Nowojski
Sent: Thursday, November 21, 2019 10:14 AM
To: Hailu, Andreas [Engineering]
Cc: Zhijiang ; user@flink.apache.org
Subject: Re: CoGroup SortMerger performance degradation from 1.6.4 - 1.9.1?
Hi,
I would suspect this:
https://issues.apache.org
increase your task manager memory.
But if you can analyze the dumped hs_err file via some profiler tool to check
the memory usage, it might be more helpful to find the root cause.
Best,
Zhijiang
--
From:John Smith
Send Time:2019 Dec
.
--
From:John Smith
Send Time:2019 Dec. 25 (Wed.) 03:40
To:Zhijiang
Cc:user
Subject:Re: Flink task node shut it self off.
The shutdown happened after the massive IO wait. I don't use any state.
Checkpoints are disk based...
On Mon., Dec. 23, 2019, 1:42 a.m. Zhijiang, wrote:
Hi
not support
such operation/function atm. :)
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/event_timestamps_watermarks.html
Best,
Zhijiang
--
From:邢瑞斌
Send Time:2019 Dec. 25 (Wed.) 20:27
To:user-zh ; user
addition, the StreamingFileSink also implements exactly-once semantics for the sink.
You might also refer to it to get some insights if possible.
Best,
Zhijiang
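A minimal row-format example of the StreamingFileSink mentioned above (the output path is a placeholder); its exactly-once behavior relies on committing files on checkpoint completion, so checkpointing must be enabled:

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class StreamingFileSinkSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000L); // in-progress files are committed on checkpoint completion

        StreamingFileSink<String> sink = StreamingFileSink
                .forRowFormat(new Path("/tmp/output"), new SimpleStringEncoder<String>("UTF-8"))
                .build();

        env.fromElements("a", "b", "c").addSink(sink);
        env.execute("streaming-file-sink-sketch");
    }
}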
--
From:Joe Hansen
Send Time:2019 Dec. 26 (Thu.) 01:42
To:user
Subject:Aggregating
rming whether it is
happening, but maybe it is not very convenient.
Another possible way is via the checkpoint metrics, which record the
sync/async duration time; maybe they can also satisfy your requirements.
Best,
Zhi
implemented some enhancements in the scheduler layer
to support such a requirement in release-1.10. You can have a try when the RC
candidate is ready.
Best,
Zhijiang
--
From:杨东晓
Send Time:2020 Jan. 10 (Fri.) 02:10
To:Congxian Qiu
Cc:user
Hi Vishwas,
I guess this link [1] can help you understand how it works and how to use the
StreamingFileSink in practice.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/connectors/streamfile_sink.html
Best,
Zhijiang
Really great work, and thanks to everyone involved, especially the release
managers!
Best,
Zhijiang
--
From:Kurt Young
Send Time:2020 Feb. 13 (Thu.) 11:06
To:[None]
Cc:user ; dev
Subject:Re: [ANNOUNCE] Apache Flink 1.10.0 released
Best,
Zhijiang
--
From:张光辉
Send Time:2020 Feb. 12 (Wed.) 22:19
To:Benchao Li
Cc:刘建刚 ; user
Subject:Re: Encountered error while consuming partitions
Network can fail in many ways, sometimes pretty subtle (e.g. high ratio packet
If the id is not consistent in different parts, maybe it is worth creating a
jira ticket for improving the user experience.
If anyone wants to work on it, please ping me and I can give a hand.
Best,
Zhijiang
--
From:Yun Tang
Let's move the further discussion onto the jira page.
I do not have much time recently to work on this. If you want to take it, I
can assign it to you and help review the PR if I have time then. Or I can find
other people to work on it in the future.
Best,
Zhi
BTW, FLIP-75 is aimed at the user experience of the web UI.
@Yadong Xie, have we already considered this issue of unifying the ids in different
parts in FLIP-75?
Best,
Zhijiang
--
From:Zhijiang
Send Time:2020 Feb. 14 (Fri.) 13:03
Congrats Jingsong! Welcome on board!
Best,
Zhijiang
--
From:Zhenghua Gao
Send Time:2020 Feb. 21 (Fri.) 12:49
To:godfrey he
Cc:dev ; user
Subject:Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer
Congrats Jingsong!
Best
ld be sent to the required peers during task scheduling and
deployment.
Best,
Zhijiang
--
From:KristoffSC
Send Time:2020 Feb. 26 (Wed.) 19:39
To:user
Subject:Re: How JobManager and TaskManager find each other?
Thanks all for the an
metrics are `0` in the light-weight situation
which I mentioned above. So we can not estimate the saturation unless we
increase the source emit rate.
Hope to hear good news from you!
Best,
Zhijiang
--
From:Arvid Heise
Send Time:2020 Feb
output buffers are not consumed by downstream tasks.
I think you can check the downstream task whose in-queue usage should reach
100%, then jstack the corresponding downstream tasks that may be stuck in some
operations causing the back pressure.
Best,
Zhijiang
,
Zhijiang
--
From:Akshay Mendole
Send Time:2018 Nov. 22 (Thu.) 13:43
To:user
Subject:OutOfMemoryError while doing join operation in flink
Hi,
We are converting one of our Pig pipelines to Flink using Apache Beam. The
Pig pipeline reads two
So you can decrease the
"taskmanager.memory.fraction" to a lower fraction, increase the total task
manager memory to cover this overhead memory, or set one slot for each task manager.
Best,
Zhijiang
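For illustration, the settings mentioned above written as configuration entries (they would normally go into flink-conf.yaml; the values are only examples for that older Flink version):

import org.apache.flink.configuration.Configuration;

public class MemoryConfigSketch {
    public static void main(String[] args) {
        // Equivalent of flink-conf.yaml entries; values are illustrative only.
        Configuration conf = new Configuration();
        conf.setString("taskmanager.memory.fraction", "0.5");  // lower managed-memory fraction
        conf.setString("taskmanager.numberOfTaskSlots", "1");  // one slot per task manager
        System.out.println(conf);
    }
}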
--
From:Akshay Mendole
Send Time
I think it is probably related to RocksDB memory usage if you have not seen an
OutOfMemory issue before.
There is already a jira ticket [1] for fixing this issue, and you can
watch it for updates. :)
[1] https://issues.apache.org/jira/browse/FLINK-10884
Best,
Zhijiang
It may be worked around by increasing the task manager memory size.
The recovery failure depends on several issues: whether there was a successful
checkpoint before, whether the states are available, and what the failover strategy is.
Best,
Zhijiang
Hi Shuang,
Normally the exception you mentioned is not the root cause of the failover; it
is mainly caused by the cancel process that makes the task exit.
You can further check whether there are other failures in the job master log to
find the root cause.
Best,
Zhijiang
B is
empty.
Best,
Zhijiang
--
From:Gagan Agrawal
Send Time:2019 Jan. 7 (Mon.) 12:06
To:user
Subject:Buffer stats when Back Pressure is high
Hi,
I want to understand whether any of the buffer stats help in debugging / validating
that
number of netty threads and timeout should make sense
for normal cases.
Best,
Zhijiang
--
From:Wenrui Meng
Send Time:2019 Jan. 9 (Wed.) 18:18
To:Till Rohrmann
Cc:user ; Konstantin
Subject:Re: ConnectTimeoutException when
Congrats Thomas!
Best,
Zhijiang
--
From:Kostas Kloudas
Send Time:2019 Feb. 12 (Tue.) 22:46
To:Jark Wu
Cc:Hequn Cheng ; Stefan Richter
; user
Subject:Re: [ANNOUNCE] New Flink PMC member Thomas Weise
Congratulations Thomas!
Best
the interested one and handle the progress of it.
Best,
Zhijiang
--
From:Jeff Zhang
Send Time:2019 Feb. 14 (Thu.) 18:03
To:Stephan Ewen
Cc:dev ; user ; jincheng sun
; Shuyi Chen ; Rong Rong
Subject:Re: [DISCUSS] Adding a mid-term
implementation prefixed with `akka`, and the
other is the flink internal implementation.
Best,
Zhijiang
--
From:sohimankotia
Send Time:2019 Feb. 18 (Mon.) 14:40
To:user
Subject:Confusion in Heartbeat configurations
Hi,
In
https://ci.apache.org
task TPS are not
decreased for a period as before, I think we could confirm the above analysis.
:)
Best,
Zhijiang
--
From:Paul Lam
Send Time:2019 Feb. 28 (Thu.) 15:17
To:user
Subject:Flink performance drops when async checkpoint is
is suitable for your scenario, and you may have a try.
Best,
Zhijiang
--
From:LINZ, Arnaud
Send Time:2019 Feb. 28 (Thu.) 17:28
To:user
Subject:Checkpoints and catch-up burst (heavy back pressure)
Hello,
I have a simple streaming app that get da
which operation delays the task to cause the
backpressure, and this operation might involve HDFS. :)
Best,
Zhijiang
--
From:Paul Lam
Send Time:2019 Feb. 28 (Thu.) 19:17
To:zhijiang
Cc:user
Subject:Re: Flink performance drops
one
single downstream task (`a` is the parallelism of the source vertex), because it is
an all-to-all connection. The barrier alignment takes more time in rebalance mode
than in forward mode.
Best,
Zhijiang
--
From:LINZ, Arnaud
Send Time
queued in front of
barriers. This is the right way to try, and I wish your solution with the 2
parameters works.
Best,
Zhijiang
--
From:LINZ, Arnaud
Send Time:2019 Mar. 2 (Sat.) 16:45
To:zhijiang ; user
Subject:RE: Checkpoints and catch-up
what is the
flush timeout you configured. Also you can trace the current metrics of
outqueue usage|length and inqueue usage|length to find something.
Best,
Zhijiang
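The flush timeout mentioned here is configured per job via setBufferTimeout; a minimal sketch (the value is only an example):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BufferTimeoutSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Output buffers are flushed when they are full or after this many milliseconds,
        // trading latency against throughput.
        env.setBufferTimeout(100);
    }
}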
--
From:Brian Ramprasad
Send Time:2019 Mar. 12 (Tue.) 03:47
To:user
Cool!
Finally we get to see the Flink 1.8.0 release. Thanks Aljoscha for this excellent
work, and thanks to the other contributors for their efforts.
We will continue working hard for Flink 1.9.0.
Best,
Zhijiang
--
From:vino yang
Send Time:2019 Apr. 10 (Wed.) 17
ted.
Best,
Zhijiang
--
From:Xiangfeng Zhu
Send Time:2019 Apr. 12 (Fri.) 08:03
To:user
Subject:Question regarding "Insufficient number of network buffers"
Hello,
My name is Allen, and I'm currently researching different distr
Hi Wenrui,
I think the akka gated issue and the inactive netty channel are both caused by some
task manager exiting or being killed. You should double check the status and reason of
this task manager `'athena592-phx2/10.80.118.166:44177'`.
Best
restarted.
Best,
Zhijiang
--
From:Peter Zende
Send Time:2019 Apr. 14 (Sun.) 00:25
To:user
Subject:Retain metrics counters across task restarts
Hi all
We're exposing Prometheus metrics from our Flink (v1.7.1) pipeline to
Prome
Hi Henry,
The backpressure tracking is not realized in the metric framework; you could check
the details via [1]. I am not sure why your requirement is to show
backpressure in metrics.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.8/monitoring/back_pressure.html
Best,
Zhijiang
length/usage on
consumer side. Although it is sometimes not very accurate, it could provide
some hints of backpressure, because the outqueue and inqueue should be filled
with buffers between producer and consumer when backpressure occurs.
Best,
Zhijiang
Hi Wenrui,
You might further check whether there exists a network connection issue between the
job master and the target task executor if you confirm the target task executor is
still alive.
Best,
Zhijiang
--
From:Biao Liu
Send Time:2019 Apr
Hi Wenrui,
I think you could trace the log of the node manager, which contains the lifecycle of
this task executor. Maybe this task executor was killed by the node manager because
of memory overuse.
Best,
Zhijiang
--
From:Wenrui Meng
Send
one task triggers
another task to release memory in the same TM. Or you could increase the
network buffer setting to work around it, but I am not sure this way could work for
your case because it depends on the total data size the source produced.
Best,
Zhijiang
bugs in previous
flink versions.
[1] https://issues.apache.org/jira/browse/FLINK-12544
Best,
Zhijiang
--
From:Narayanaswamy, Krishna
Send Time:2019 May 17 (Fri.) 19:00
To:zhijiang ; Aljoscha Krettek
; Piotr Nowojski
Cc:Nico Kruber
Hi Krishna,
Could you show me or attach the jstack for the single slot case? Or is it the
same jstack as before?
Best,
Zhijiang
--
From:Narayanaswamy, Krishna
Send Time:2019 May 21 (Tue.) 19:50
To:zhijiang ; Aljoscha Krettek
,
Zhijiang
--
From:Narayanaswamy, Krishna
Send Time:2019 May 22 (Wed.) 00:49
To:zhijiang ; Aljoscha Krettek
; Piotr Nowojski
Cc:Nico Kruber ; user@flink.apache.org
; "Chan, Regina" ; "Erai, Rahul"
Subject:RE
for review atm. You could pick the code in the PR to verify the results if you
like. And the next release-1.8.1 might cover this fix as well.
Best,
Zhijiang
--
From:Erai, Rahul
Send Time:2019 Jun. 4 (Tue.) 15:50
To:zhijiang ; Aljoscha
ishna"
Cc:Nico Kruber ; user@flink.apache.org
; "Chan, Regina"
Subject:RE: Re: Memory Allocate/Deallocate related Thread Deadlock encountered
when running a large job > 10k tasks
Thanks Zhijiang.
Can you point us to the JIRA for your fix?
Regards,
-Rahul
From: zhijiang
Sent: Tue
It is reasonable as Stephan explained. +1 from my side!
--
From:Jeff Zhang
Send Time:2019 Jun. 11 (Tue.) 22:11
To:Stephan Ewen
Cc:user ; dev
Subject:Re: [DISCUSS] Deprecate previous Python APIs
+1
Stephan Ewen wrote on Tue, Jun 11, 2019 at 9:30 PM:
could implement a custom MetricReporter, and then only concentrate on
your required application metrics in the method
`MetricReporter#notifyOfAddedMetric` to show them in the backend.
Best,
Zhijiang
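A minimal sketch of such a reporter (the filter keyword and the backend are assumptions); it would then be registered through the metrics.reporter.* options in flink-conf.yaml:

import org.apache.flink.metrics.Metric;
import org.apache.flink.metrics.MetricConfig;
import org.apache.flink.metrics.MetricGroup;
import org.apache.flink.metrics.reporter.MetricReporter;

// Reacts only to the metrics you care about; forwarding them to a backend is left open.
public class FilteringReporter implements MetricReporter {

    @Override
    public void open(MetricConfig config) {
        // read reporter-specific options here if needed
    }

    @Override
    public void close() {
    }

    @Override
    public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
        if (metricName.contains("myAppMetric")) { // placeholder filter
            // register/forward the metric to your backend here
        }
    }

    @Override
    public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {
        // unregister it from the backend if needed
    }
}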
--
From:M Singh
Send Time:2019 Jun.
you could double check this dir for the issue.
In addition, I suggest you upgrade the flink version because flink-1.3.3 is
too old. After upgrading to flink-1.5 or above, you do not need to consider this
issue, because the exactly-once mode would not spill data to disk any more.
Best,
Zhijiang
/browse/FLINK-10462
Best,
Zhijiang
--
From:aitozi
Send Time:2019 Jun. 16 (Sun.) 22:19
To:user
Subject:A little doubt about the blog A Deep To Flink's NetworkStack
Hi, community
I read this blog A Deep To Flink's NetworkSt
As long as one task is in canceling state, the job status might still be
in canceling state.
@Joshua Can you confirm that all of the tasks in the topology were already in a
terminal state such as failed or canceled?
Best,
Zhijiang
to check the task final state.
Best,
Zhijiang
--
From:Joshua Fan
Send Time:2019 Jun. 20 (Thu.) 11:55
To:zhijiang
Cc:user ; Till Rohrmann ; Chesnay
Schepler
Subject:Re: Maybe a flink bug. Job keeps in FAILING state
zhijiang
I did not
exit to solve the potential issue.
Best,
Zhijiang
--
From:Chesnay Schepler
Send Time:2019 Jun. 21 (Fri.) 16:34
To:zhijiang ; Joshua Fan
Cc:user ; Till Rohrmann
Subject:Re: Maybe a flink bug. Job keeps in FAILING state
The logs are at
final decision.
Best,
Zhijiang
--
From:Joshua Fan
Send Time:2019 Jun. 25 (Tue.) 11:10
To:zhijiang
Cc:Chesnay Schepler ; user ; Till
Rohrmann
Subject:Re: Maybe a flink bug. Job keeps in FAILING state
Hi Zhijiang
Thank you for your
Congratulations Rong!
Best,
Zhijiang
--
From:Kurt Young
Send Time:2019 Jul. 11 (Thu.) 22:54
To:Kostas Kloudas
Cc:Jark Wu ; Fabian Hueske ; dev
; user
Subject:Re: [ANNOUNCE] Rong Rong becomes a Flink committer
Congratulations Rong
Hi Osh,
As far as I know, currently one dataset source can not be consumed by several
different vertices, and from the API you can not construct the topology for your
request.
I think your way of merging different reduce functions into one UDF is feasible.
Maybe someone has a better solution. :)
zhijiang
whether and where the OOM was caused. Maybe check the task
failure logs.
Zhijiang
--
From:gerardg
Send Time:2018 Jun. 30 (Sat.) 00:12
To:user
Subject:Re: Flink job hangs/deadlocks (possibly related to out of memory)
(fixed formatting)
Hello
to
trigger restarting the job.
Zhijiang
--
From:Gerard Garcia
Send Time:2018 Jul. 2 (Mon.) 18:29
To:wangzhijiang999
Cc:user
Subject:Re: Flink job hangs/deadlocks (possibly related to out of memory)
Thanks Zhijiang,
We haven't found any
taskmanager.network.memory.floating-buffers-per-gate.
If you have other questions about them, let me know, then I can explain them for you.
Zhijiang
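For reference, the related settings written as configuration entries (normally set in flink-conf.yaml; the values shown are assumptions, not a recommendation):

import org.apache.flink.configuration.Configuration;

public class InFlightDataConfigSketch {
    public static void main(String[] args) {
        // Smaller values mean less in-flight data buffered per channel / per gate.
        Configuration conf = new Configuration();
        conf.setString("taskmanager.network.memory.buffers-per-channel", "2");
        conf.setString("taskmanager.network.memory.floating-buffers-per-gate", "8");
        System.out.println(conf);
    }
}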
--
From:Vishal Santoshi
Send Time:2018 Jul. 5 (Thu.) 22:28
To:user
Subject:Limiting in flight data
"Yes,
not cause OOM). I think you should not worry
about that. Normally it is better to consider the TPS of both sides and set a
proper parallelism to avoid back pressure to some extent.
Zhijiang
--
From:Mich Talebzadeh
Send Time:2018 Jul. 4 (Wed.
trics for some helps.
--
From:Vishal Santoshi
Send Time:2018 Jul. 6 (Fri.) 22:05
To:Zhijiang(wangzhijiang999)
Cc:user
Subject:Re: Limiting in flight data
Further, if there are metrics that allow us to chart delays per pipe on n/w
buffers, that would be immensely help
framework. Also you can monitor the GC status to check the full GC delay.
Best,
Zhijiang
--
From:Gerard Garcia
Send Time:2018 Jul. 13 (Fri.) 16:22
To:wangzhijiang999
Cc:user
Subject:Re: Flink job hangs/deadlocks (possibly related to out of m
for the lock which is also
occupied by the task output process.
As you mentioned, it makes sense to check the data structure of the output
record and reduce its size or make it lightweight to handle.
Best,
Zhijiang
--
From:Gerard Garcia
askManager received the task deployment from the JobManager with a delay, or some
operations in upstream task initialization unexpectedly cost
more time before registering the result partition.
Best,
Zhijiang
--
From:Steffen Wohlers
Send Time:2018 Jul. 22 (Sun.)
buffers in record
serializers. If the record size is large and the downstream parallelism is
large, it may cause an OOM issue in serialization.
Could you show the stack of the OOM part? If that is the case, the following [1]
can solve it and it is a work in progress.
Zhijiang
[1] https
I remember,
that means the downstream will be scheduled after the upstream finishes, so the
slower downstream will not block the upstream from running, and then the backpressure
may not exist in this case.
Best,
Zhijiang
--
From:Darshan Singh
Send Time
(groupby in your case) or decrease the parallelism of the fast node (source in
your case).
Best,
Zhijiang
--
From:Darshan Singh
Send Time:2018 Aug. 29 (Wed.) 18:16
To:chesnay
Cc:wangzhijiang999 ; user
Subject:Re: Backpressure? for Batches
Thanks, Now
You can check the log to see the related stack of the OOM; maybe we can confirm
some reasons.
Or you can dump the heap to analyze the memory usage after the OOM.
Best,
Zhijiang
--
From:Darshan Singh
Send Time:2018 Aug. 29 (Wed.) 19:22
To
,
Zhijiang
--
From:杨力
Send Time:2018 Sep. 7 (Fri.) 13:09
To:user
Subject:Flink 1.6 Job fails with IllegalStateException: Buffer pool is destroyed.
Hi all,
I am encountering a weird problem when running Flink 1.6 in YARN per-job
clusters.
The job