The Apache Flink community is very happy to announce the release of Apache
Flink 2.0 Preview.
Apache Flink® is an open-source unified stream and batch data processing
framework for distributed, high-performing, always-available, and accurate
data applications.
This release is a preview of the upc
Dear Flink Community,
We are excited to share some important news with you!
Flink Forward Asia 2024 is coming up with two major events: the first in
Shanghai on November 29-30, and the second in Jakarta on December 5. These
gatherings will focus on the latest developments, future plans, and
pro
Dear Community,
I'm pleased to share this good news with everyone. As some of you may have
already heard, Apache Flink has won the 2023 SIGMOD Systems Award [1].
"Apache Flink greatly expanded the use of stream data-processing." --
SIGMOD Awards Committee
SIGMOD is one of the most influential da
> require all PRs that are merged to exist as a Github Issue?
> 3. There's no longer one central administration, which is especially
> valuable to track all issues across projects like the different connectors,
> Flink ML, Table Store etc.
> 4. Our current CI labeling works on the Jir
Hi devs and users,
As many of you may have already noticed, Infra announced that they will
soon disable public Jira account signups [1]. That means, in order for
someone who is not yet a Jira user to open or comment on an issue, he/she
has to first reach out to a PMC member to create an account fo
>> operator (usually considered as "numRecordsOut" of tasks).
>>>>> > > The original issue was that the numRecordsOut of the sink counted
>>>>> both (which is completely wrong).
>>>>> > >
>>>>> > > A new met
Actually, this is an on-going discussion related to 1.15.3. The community
discovered a breaking change in 1.15.x and is discussing how to resolve
this right now [1]. There is very likely a 1.15.3 release after this is
resolved.
Best,
Xintong
[1] https://lists.apache.org/thread/vxhty3q97s7pw2zn0
g TaskManagers won't make the JobManager restart. You can
> provide the whole log as an attachment to investigate.
>
> On Wed, 12 Oct 2022 at 6:01 PM, Puneet Duggal
> wrote:
>
>> Hi Xintong Song,
>>
>> Thanks for your immediate reply. Yes, I do restart task man
The log shows that the jobmanager received a SIGTERM signal from an external
source. Depending on how you deploy Flink, that could be a 'kill' command, or
a kubernetes pod removal / eviction, etc. You may want to check where the
signal came from.
Best,
Xintong
On Wed, Oct 12, 2022 at 6:26 AM Puneet Dug
+1 for reverting these changes in Flink 1.16.
For 1.15.3, can we make these metrics available via both names (numXXXOut
and numXXXSend)? In this way we don't break it for those who already
migrated to 1.15 and numXXXSend. That means we still need to change
SinkWriterOperator to use another metric
512mb is just too small for a TaskManager. You would need to either
increase it, or decrease the other memory components (which currently use
default values).
The 64mb Total Flink Memory comes from the 512mb Total Process Memory minus
192mb minimum JVM Overhead and 256mb default JVM Metaspace.
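To make the arithmetic above concrete, here is a minimal sketch in Python. The
function name is hypothetical, and the overhead fraction (0.1) and the 1gb
overhead maximum are assumed defaults that are not stated in this message; the
192mb minimum and 256mb metaspace are the values quoted above.

```python
def total_flink_memory_mb(process_mb,
                          metaspace_mb=256,          # default JVM Metaspace (from the message)
                          overhead_fraction=0.1,     # assumed default fraction
                          overhead_min_mb=192,       # minimum JVM Overhead (from the message)
                          overhead_max_mb=1024):     # assumed default maximum
    """Derive Total Flink Memory from Total Process Memory (all values in mb)."""
    # JVM Overhead is a fraction of the process memory, clamped to [min, max].
    overhead_mb = min(max(process_mb * overhead_fraction, overhead_min_mb),
                      overhead_max_mb)
    # Whatever is left after Metaspace and Overhead is Total Flink Memory.
    return process_mb - metaspace_mb - overhead_mb


print(total_flink_memory_mb(512))
```

With 512mb of process memory the overhead is clamped up to its 192mb minimum,
which is why so little remains for everything else.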
Be
Hi everyone,
I'm very happy to announce that the Apache Flink community has created a
dedicated Slack workspace [1]. Welcome to join us on Slack.
## Join the Slack workspace
You can join the Slack workspace in either of the following two ways:
1. Click the invitation link posted on the project w
iative is more about
making communication more efficient, rather than making information easier
to find.
Thank you~
Xintong Song
On Wed, May 11, 2022 at 5:39 PM Konstantin Knauf wrote:
> I don't think we can maintain two additional channels. Some people have
> already concerns about
I'm not
very familiar with Discourse or Reddit. My impression is that they are not
as easy to set up and maintain as Slack.
Thank you~
Xintong Song
[1] https://asktug.com/
On Tue, May 10, 2022 at 4:50 PM Konstantin Knauf wrote:
> Thanks for starting this discussion again. I am pretty
he global English-speaking community.
Concerning Stack Overflow, it definitely deserves more attention from the
community. Thanks for the suggestion / reminder, Piotr & David. I think
Slack and Stack Overflow are probably not mutually exclusive.
Thank you~
Xintong Song
[1] https://zapier.com/
On Sat
Thank you~
Xintong Song
-- Forwarded message -
From: Xintong Song
Date: Fri, May 6, 2022 at 5:07 PM
Subject: Re: [Discuss] Creating an Apache Flink slack workspace
To: private
Cc: Chesnay Schepler
Hi Chesnay,
Correct me if I'm wrong, I don't find this is *
ading / sink writing data from
/ to external systems, are not counted. In your case, there's only 1
vertex in the DAG, thus no internal data exchanges.
Thank you~
Xintong Song
On Wed, Apr 6, 2022 at 11:21 PM Peter Schrott wrote:
> Hi there,
>
> I just successfully upgraded our Flink
ix the problem.
If the problem is not fixed, but the job runs longer before the OOM
happens, then it's likely the 3rd case. Moreover, you can monitor the pod
memory footprint changes if such metrics are available.
Thank you~
Xintong Song
On Tue, Feb 15, 2022 at 11:56 PM Alexey Trenikhun w
you share what that is for?
Thank you~
Xintong Song
On Tue, Feb 15, 2022 at 12:10 PM Alexey Trenikhun wrote:
> Hello,
> We use RocksDB, but there is no problem with Java heap, which is limited
> by 3.523gb, the problem with total container memory. The pod is killed
> not due OutO
mode
support shipping local dependencies.
- I'm not sure about dropping the per-job mode soonish, as many users are
still working with it. We'd better not force these users to migrate to the
application mode when upgrading the Flink version.
Thank you~
Xintong Song
On Fri, Jan 21,
Thanks for volunteering to drive this effort, Marton, Thomas and Gyula.
Looking forward to the public discussion. Please feel free to reach out if
there's anything you need from us.
Thank you~
Xintong Song
On Fri, Jan 14, 2022 at 8:27 AM Chenya Zhang
wrote:
> Thanks Thomas, Gy
f deployment's
resource requirements. In this way, users are free to choose between active
and reactive (e.g., HPA) rescaling, while always benefiting from the
beyond-deployment lifecycle (upgrades, savepoint management, etc.) and
alignment with the K8s ecosystem (Flink client free, operating via
`taskmanager.numberOfTaskSlots`.
If you have multiple jobs submitted to a shared Flink cluster, decreasing
the number of slots in a task manager should also reduce the number of
classes loaded by the JVM, thus requiring less metaspace.
Thank you~
Xintong Song
On Mon, Dec 27, 2021 at 9:08 AM John Smith
o longer support hadoop versions < 2.8 at all. And
if that is not permitted by our users, we may consider keeping the codebase
as is and wait for a bit longer.
WDYT?
Thank you~
Xintong Song
[1]
https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-common/Compatibility.html#Wire_co
job
needs, which probably depends on your hbase client configurations.
Thank you~
Xintong Song
On Wed, Dec 15, 2021 at 1:40 PM Anton wrote:
> Hi, from time to time my job is stopping to process messages with warn
> message listed below. Tried to increase jobmanager.memory.process.si
only pod evictions, but also other problems (jvm out-of-memory, remote
storage connection downtime, etc.).
Thank you~
Xintong Song
On Tue, Oct 26, 2021 at 7:39 AM Deshpande, Omkar
wrote:
> Hello,
>
> We are running flink on Kubernetes(Standalone) in application cluster
> mode. The
Hi everyone,
The RC0 for Apache Flink 1.14.0 has been created. This is still a
preview-only release candidate to drive the current testing efforts and so
no official votes will take place. It has all the artifacts that we would
typically have for a release, except for the release note and the webs
Thanks Yun and everyone~!
Thank you~
Xintong Song
On Mon, Aug 9, 2021 at 10:14 PM Till Rohrmann wrote:
> Thanks Yun Tang for being our release manager and the great work! Also
> thanks a lot to everyone who contributed to this release.
>
> Cheers,
> Till
>
> On Mon, A
erhead.[min|max|fraction]'). That helps reserve
more native memory in the Kubernetes pod.
Thank you~
Xintong Song
On Fri, Jul 2, 2021 at 11:51 AM Sudharsan R wrote:
> Hi Xintong,
> Thanks very much for the response. Let me check out the new UI on flink
> 1.12.
>
> The reason I
very often lead to confusion.
Since Flink 1.12, we have introduced a new web UI for the memory metrics,
where the legacy metrics are preserved only for backward compatibility and
are placed in an `Advanced` pane. I'd recommend ignoring them in 99% of the
cases.
Thank you~
Xintong Song
ies
in a session cluster [3]. Please be aware that in standalone Kubernetes
deployment, Flink looks for log4j-console.properties instead of
log4j.properties. By default, this will write the logs to stdout, so that
the logs can be viewed by the `kubectl logs` command.
Thank you~
Xintong Song
[
Hi Thomas,
It would be helpful if you can provide the jobmanager/taskmanager logs, and
gc logs if possible.
Additionally, you may consider monitoring the CPU/memory-related metrics
[1] to see if there's anything abnormal when the problem is observed.
Thank you~
Xintong Song
[1]
modeled
as a special case of general control messages.
- Watermarks are probably similar to the other control messages. However,
it's already exposed to users as public APIs. If we want to migrate it to
the new control flow, we'd need to be very careful not to break any compatibility.
Thank you~
events from JobMaster
3. Consume control events from arbitrary operators downstream where
the events are produced
Thank you~
Xintong Song
On Tue, Jun 8, 2021 at 1:37 PM Yun Gao wrote:
> Many thanks to Jiangang for bringing this up, and many thanks for the
> discussion!
>
>
ntrolling feature, but
potentially other future features as well.
- AFAICS, it's non-trivial to make a 3rd-party dynamic configuration
framework work together with Flink's consistency mechanism.
Thank you~
Xintong Song
On Mon, Jun 7, 2021 at 11:05 AM 刘建刚 wrote:
> Thank you for t
session cluster. Thus, status of historical jobs can be
accessed via the JM.
2. You can try setting up a history server [1], where information of
finished jobs can be archived.
Thank you~
Xintong Song
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/advanced
straightforward.
Unfortunately, these efforts are still in progress, and have more or less
stalled recently.
Thank you~
Xintong Song
[1] https://issues.apache.org/jira/browse/FLINK-20681
[2] https://issues.apache.org/jira/browse/FLINK-20811
[3] https://issues.apache.org/jira/browse/FLINK-20867
On
Yes, it does.
Internally, each re-scheduling is performed as a stop-and-resume of the job,
similar to a failover. Without checkpoints, the job will always restore
from the very beginning.
Thank you~
Xintong Song
On Mon, May 17, 2021 at 2:54 PM Alexey Trenikhun wrote:
> Hi Xintong,
>
s. This is usually observed for large scale jobs (in terms of
number of vertices and parallelism). In that case, we would have to
increase the heartbeat timeout.
Thank you~
Xintong Song
On Mon, May 17, 2021 at 11:12 AM Smile wrote:
> JM log shows this:
>
> INFO org.apache.fli
work with both the default and the new reactive modes.
Thank you~
Xintong Song
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/state/checkpoints/#unaligned-checkpoints
On Fri, May 14, 2021 at 11:29 PM Alexey Trenikhun wrote:
> Hello,
>
> Is new rea
eed by the checkpointing
mechanism. The new task does not resume from the exact position where the
old task is stopped. Instead, it resumes from the last successful
checkpoint.
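As a rough illustration of "resume from the last successful checkpoint", here
is a toy model in Python. It is only a sketch: the function and its parameters
are hypothetical, it tracks a single input offset, and it ignores the fact
that Flink also rolls operator state back to the checkpoint.

```python
def run_with_failover(records, checkpoint_interval, fail_at):
    """Return the sequence of records read, including replays after one failure.

    checkpoint_interval: take a checkpoint every N records (0 disables checkpoints).
    fail_at: input position at which a single failure is injected.
    """
    read = []
    last_checkpoint = 0   # input offset covered by the last successful checkpoint
    failed = False
    pos = 0
    while pos < len(records):
        if not failed and pos == fail_at:
            failed = True
            pos = last_checkpoint   # the new task restarts from the checkpointed offset
            continue
        read.append(records[pos])
        pos += 1
        if checkpoint_interval and pos % checkpoint_interval == 0:
            last_checkpoint = pos
    return read


records = list(range(10))
print(run_with_failover(records, checkpoint_interval=4, fail_at=6))
print(run_with_failover(records, checkpoint_interval=0, fail_at=6))
```

In the first call the records after offset 4 are read again; in the second
call, with no checkpoints, everything is read again from the very beginning.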
Thank you~
Xintong Song
On Thu, May 13, 2021 at 5:38 PM Guowei Ma wrote:
> Hi,
> In fact, not only JobManager
guration option `kubernetes.taskmanager.cpu`
controls the cpu resource of pods Flink requests from Kubernetes.
Thank you~
Xintong Song
On Fri, May 7, 2021 at 10:35 AM Fan Xie wrote:
> Hi Flink Community,
>
> Recently I am working on an auto-scaling project that needs to dynamically
> adjust the cpu
Thanks Dawid & Guowei as the release managers, and everyone who has
contributed to this release.
Thank you~
Xintong Song
On Thu, May 6, 2021 at 9:51 AM Leonard Xu wrote:
> Thanks Dawid & Guowei for the great work, thanks everyone involved.
>
> Best,
> Leonard
>
Thanks for driving this, Konstantin.
Great job~!
Thank you~
Xintong Song
On Thu, Apr 22, 2021 at 11:57 PM Matthias Pohl
wrote:
> Thanks for setting this up, Konstantin. +1
>
> On Thu, Apr 22, 2021 at 11:16 AM Konstantin Knauf
> wrote:
>
>> Hi everyone,
>>
>>
These metrics should also be available via REST.
You can check the original design doc [1] for which metrics the UI is using.
Thank you~
Xintong Song
[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-102%3A+Add+More+Metrics+to+TaskManager
On Tue, Apr 13, 2021 at 9:08 PM Alexis Sarda
-XX:MaxDirectMemorySize'
and is not controlled by the garbage collectors.
Thank you~
Xintong Song
On Tue, Apr 13, 2021 at 7:53 PM Alexis Sarda-Espinosa <
alexis.sarda-espin...@microfocus.com> wrote:
> Hello,
>
>
>
> I have a Flink TM configured with taskmanager.memory.mana
+1
It has already been a matter of fact for a while that we no longer port new
features to the Mesos deployment.
Thank you~
Xintong Song
On Fri, Mar 26, 2021 at 10:37 PM Till Rohrmann wrote:
> +1 for officially deprecating this component for the 1.13 release.
>
> Cheers,
> Till
If all the tasks have the same parallelism 36, your job should only
allocate 36 slots. The evenly-spread-out-slots option should help in your
case.
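A small sketch of the slot arithmetic behind this, in Python. It assumes that
subtasks of different operators in the same slot sharing group can share a
slot, so each group needs as many slots as its highest parallelism. The
function and the group names are hypothetical.

```python
def required_slots(parallelism_by_group):
    """parallelism_by_group maps a slot sharing group name to the
    parallelisms of the operators placed in that group."""
    # Each group needs max(parallelism) slots; groups do not share slots.
    return sum(max(parallelisms) for parallelisms in parallelism_by_group.values())


# All operators in the default group with parallelism 36.
print(required_slots({"default": [36, 36, 36]}))
```

If the job ends up holding more slots than this, operators were probably
assigned to separate slot sharing groups.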
Is it possible for you to share the complete jobmanager logs?
Thank you~
Xintong Song
On Tue, Mar 16, 2021 at 12:46 AM Aeden Jameson
wrote
s containing a subtask of it, and there's no
guarantee which 36 out of the 54 contain it.
Thank you~
Xintong Song
On Mon, Mar 15, 2021 at 3:54 AM Chesnay Schepler wrote:
> Is this a brand-new job, with the cluster having all 18 TMs at the time
> of submission? (or did you add more TMs
Hi Hemant,
I don't see any problem in your settings. Any exceptions suggesting why TM
containers are not coming up?
Thank you~
Xintong Song
On Sat, Mar 6, 2021 at 3:53 PM bat man wrote:
> Hi Xintong Song,
> I tried using the java options to generate heap dump referring to docs[1]
memory leak is suspected, to further understand where the memory is
consumed, you may need to dump the heap on OOMs and look for unexpected
memory usage with profiling tools.
Thank you~
Xintong Song
[1]
https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/memleaks002.html
cess, such
as tremendous memory consumption, busy RPC main thread, etc. To make that
case work, we did many optimizations on our internal flink version, which
we are trying to contribute to the community. See FLINK-21110 [1] for the
details.
Thank you~
Xintong Song
[1] https://issues.apache.org/j
What you're looking for might be Session Window[1].
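To illustrate the idea outside of Flink, here is a minimal plain-Python sketch
of session grouping: events belong to the same session as long as the time
between consecutive events does not exceed a gap. The function name is
hypothetical, and this is not the Flink API, only the grouping logic under
that assumption.

```python
def session_windows(timestamps, gap):
    """Group timestamps into sessions separated by more than `gap` of inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)   # still within the gap: extend the current session
        else:
            sessions.append([ts])     # inactivity exceeded the gap: start a new session
    return sessions


print(session_windows([1, 2, 3, 10, 11, 30], gap=5))
```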
Thank you~
Xintong Song
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/stream/operators/windows.html#session-windows
On Fri, Feb 19, 2021 at 7:35 PM ゞ野蠻遊戲χ wrote:
> hi all
>
> For example, if A
0 bytes
> INFO [] - Network: 128.000mb (134217730 bytes)
> INFO [] - JVM Metaspace: 256.000mb (268435456 bytes)
> INFO [] - JVM Overhead: 192.000mb (201326592 bytes)
Thank you~
Xintong Song
On Tue, Feb 2, 2021
Hi Randal,
The image is too blurred to be clearly seen.
I have a few questions.
- IIUC, you are using the standalone K8s deployment [1], not the native K8s
deployment [2]. Could you confirm that?
- How is the memory measured?
Thank you~
Xintong Song
[1]
https://ci.apache.org/projects/flink
re of any issue related to
the upgrading of the ZK version that may cause the leadership loss.
Thank you~
Xintong Song
On Sun, Jan 31, 2021 at 4:14 AM Colletta, Edward
wrote:
> “but I'm not aware of any similar issue reported since the upgrading”
>
> For the record, we experienced th
Thank you~
Xintong Song
On Sat, Jan 30, 2021 at 8:27 AM Xintong Song wrote:
> There was indeed a ZK version upgrade between 1.9 and 1.11, but I'm not
> aware of any similar issue reported since the upgrading.
> I would suggest the following:
> - Turn on the DEBUG
Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348668
We would like to thank all contributors of the Apache Flink community who
made this release possible!
Regards,
Xintong Song
r complain about timeout, and there's no gc
issue spotted, I would consider a network instability.
Thank you~
Xintong Song
On Fri, Jan 29, 2021 at 3:15 AM Lu Niu wrote:
> After checking the log I found the root cause is zk client timeout on TM:
> ```
> 2021
The Apache Flink community is very happy to announce the release of Apache
Flink 1.12.1, which is the first bugfix release for the Apache Flink 1.12
series.
Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
a
ed as `yarn.ship-files`, `yarn.ship-archives` or
`yarn.provided.lib.dirs`? This helps us to locate the code path that this
file went through.
Thank you~
Xintong Song
On Sun, Jan 17, 2021 at 10:32 PM Mark Davis wrote:
> Hi all,
> I am upgrading my DataSet jobs from Flink 1.8 to 1.12.
> Aft
I believe what you are looking for is the State TTL [1][2].
Thank you~
Xintong Song
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#state-time-to-live-ttl
[2]
https://ci.apache.org/projects/flink/flink-docs-stabledev/table/config.html#table-exec-state
The Apache Flink community is very happy to announce the release of Apache
Flink 1.11.3, which is the third bugfix release for the Apache Flink 1.11
series.
Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
a
I'm not aware of any significant changes to the HA components between
1.9/1.11.
Would you mind sharing the complete jobmanager/taskmanager logs?
Thank you~
Xintong Song
On Fri, Dec 18, 2020 at 8:53 AM Lu Niu wrote:
> Hi, Xintong
>
> Thanks for replying and your suggestion. I
into the ZooKeeper logs checking why RM's
leadership is revoked.
Thank you~
Xintong Song
On Thu, Dec 17, 2020 at 8:42 AM Lu Niu wrote:
> Hi, Flink users
>
> Recently we migrated to flink 1.11 and see exceptions like:
> ```
> 2020-12-
FYI, I've opened FLINK-20503 for this.
https://issues.apache.org/jira/browse/FLINK-20503
Thank you~
Xintong Song
On Mon, Dec 7, 2020 at 11:10 AM Xintong Song wrote:
> I forgot to mention that it is designed that task managers always have
> `Double#MAX_VALUE` cpu cores in loca
sers. Will file an issue on that.
Thank you~
Xintong Song
On Mon, Dec 7, 2020 at 11:03 AM Xintong Song wrote:
> Hi Rex,
>
> We're running this in a local environment so that may be contributing to
>> what we're seeing.
>>
> Just to double check on this. By `
requested
in such cases.
- kubernetes.jobmanager.cpu
- kubernetes.taskmanager.cpu
- yarn.appmaster.vcores
- yarn.containers.vcores
- mesos.resourcemanager.tasks.cpus
Thank you~
Xintong Song
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/memo
on the decommissioning node will be killed.
Thank you~
Xintong Song
On Fri, Nov 13, 2020 at 2:57 PM Robert Metzger wrote:
> Hi,
> it seems that YARN has a feature for targeting specific hardware:
> https://hadoop.apache.org/docs/r3.1.0/hadoop-yarn/hadoop-yarn-site/PlacementConstraints.htm
Hi Schneider,
The error message suggests that your task managers are not configured with
enough network memory. You would need to increase the network memory
configuration. See this doc [1] for more details.
Thank you~
Xintong Song
[1]
https://ci.apache.org/projects/flink/flink-docs-release
, you might want to look into this comment [1] in FLINK-18712.
- If neither of the above actions helps, we might need to leverage tools
(e.g., JVM NMT [2]) to track the native memory usages and see where exactly
the leak comes from.
Thank you~
Xintong Song
[1]
https://issues.apache.org/jira/b
upgrade to 1.10.2, to include the latest bug fixes on the 1.10 release.
Thank you~
Xintong Song
On Thu, Oct 29, 2020 at 4:41 PM Ori Popowski wrote:
> Hi,
>
> PID 20331 is indeed the Flink process, specifically the TaskManager
> process.
>
> - Workload is a streaming workload
can also try increasing the `jvm-overhead`, simply to leave more native
memory in the container in case there are other significant native
memory usages.
Thank you~
Xintong Song
On Wed, Oct 28, 2020 at 5:53 PM Ori Popowski wrote:
> Hi Xintong,
>
> See here:
>
> # Top me
n the `top` command
- Look into the `/proc/meminfo` file
- Any container memory usage metrics that are available to your Yarn cluster
Thank you~
Xintong Song
On Tue, Oct 27, 2020 at 6:21 PM Ori Popowski wrote:
> After the job is running for 10 days in production, TaskManagers start
> f
early next month. It would be greatly
appreciated if you folks, as experienced Flink on Mesos users, can help with
verifying the release candidates.
Thank you~
Xintong Song
[1]
https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F
resource management improvements may not be ported to Mesos), while
keeping other components up-to-date (e.g., improvements from programming
APIs, operators, state backends, etc.)?
Thank you~
Xintong Song
On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
lassenedergaardfl...@gmail.com> wrote:
think it should be
fine.
Thank you~
Xintong Song
[1] https://issues.apache.org/jira/browse/FLINK-19665
On Sat, Oct 24, 2020 at 5:56 AM Dan Hill wrote:
> Changing down to maven 3.2 shows an error. It seems like I'm hitting
> flaky tests. I hit one error and then a different error
oices definitely matter a lot for this community. Either way, it would be
good to draw users' attention to this discussion early.
Thank you~
Xintong Song
On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf wrote:
> Hi Robert,
>
> +1 to the plan you outlined. If we were to drop support in F
3.6.3.
I'm not sure whether the maven version is related, but maybe you can try it
out with 3.2.5. And if it turns out to work, we may file an issue with the
Apache Maven community.
Thank you~
Xintong Song
On Thu, Oct 22, 2020 at 12:31 PM Dan Hill wrote:
> 1) I don't see anything use
n
logs.
- Quick question: which PR are you working on? Did you by any chance call
`System.exit()` in your code?
Thank you~
Xintong Song
On Thu, Oct 22, 2020 at 5:59 AM Dan Hill wrote:
> Sure, here's a link
> <https://drive.google.com/file/d/13Q7h77zG-2vp7gJOke8QAzLtKLKIPuTf/view?usp=sh
Would you be able to share the complete maven logs and the command? And
what is the maven version?
Thank you~
Xintong Song
On Wed, Oct 21, 2020 at 1:37 AM Dan Hill wrote:
> Hi Xintong!
>
> No changes. I tried -X and no additional log information is logged.
> -DfailIfNoTests=fa
intended to execute the tests locally, you can try the following
actions. I'm not sure whether that helps though.
- Try to add '-DfailIfNoTests=false' to your maven command.
- Execute the maven command with '-X' to print all the debug logs.
Thank you~
Xintong Song
On Tu
No worries :)
Thank you~
Xintong Song
On Mon, Oct 12, 2020 at 2:48 PM Paul Lam wrote:
> Sorry for the misspelled name, Xintong
>
> Best,
> Paul Lam
>
> 2020年10月12日 14:46,Paul Lam 写道:
>
> Hi Xingtong,
>
> Thanks a lot for the pointer!
>
> It’s good to
FYI, I just created FLINK-19568 for tracking this issue.
Thank you~
Xintong Song
[1] https://issues.apache.org/jira/browse/FLINK-19568
On Mon, Oct 12, 2020 at 2:18 PM Xintong Song wrote:
> Hi Paul,
>
> Thanks for reporting this.
>
> Indeed, Flink's RM currently p
al states in the rpc main thread. With FLINK-19241, this can be
achieved easily by delegating the work to the io executor.
Thank you~
Xintong Song
On Mon, Oct 12, 2020 at 12:44 PM Paul Lam wrote:
> Hi,
>
> After FLINK-13184 is implemented (even with Flink 1.11), occasionally
>
g released,
see if we can do something about it.
Thank you~
Xintong Song
On Thu, Sep 24, 2020 at 6:35 PM Claude M wrote:
> I have 35 task managers, 1 slot on each. I'm running a total of 7 jobs in
> the cluster. All the slots are occupied. When you say that
] and build your custom image (from the 1.0.2 image and
replace the flink distribution with the one you built).
Thank you~
Xintong Song
[1] https://github.com/apache/flink/tree/release-1.10
[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/flinkDev/building.html
On Wed, S
that fixes
your problem.
Given that it could take weeks to reproduce your problem, I would suggest
keeping track of the native memory usage with jemalloc and jeprof. This
should provide direct information about which component is using extra
memory.
Thank you~
Xintong Song
On Tue, Sep 22
Thanks for the input, Brian.
This looks like what we are looking for. The issue is fixed in 1.10.3,
which is also consistent with this problem occurring in 1.10.2.
Maybe Claude can further confirm it.
Thank you~
Xintong Song
On Tue, Sep 22, 2020 at 10:57 AM Zhou, Brian wrote:
> Hi Xintong and Cla
dump, we can look into it later.
Thank you~
Xintong Song
On Mon, Sep 21, 2020 at 9:37 PM Claude M wrote:
> Hi Xintong,
>
> Thanks for your reply. Here is the command output w/ the java.opts:
>
> /usr/local/openjdk-8/bin/java -Xms768m -Xmx768m -XX:+UseG1GC
> -XX:+Hea
t trust Flink's "Non-Heap" metrics. It is
practically useless and misleading. The "Non-Heap" metric accounts for SOME of
the non-heap memory usage, but NOT ALL of it. The community is working on
a new set of metrics and Web UI for the task manager memory tuning.
Thank you~
Xinton
do.
>
- Which Flink Kubernetes deployment are you using? The standalone or
native Kubernetes?
- Which cluster mode are you using? Job cluster, session cluster, or the
application mode?
Thank you~
Xintong Song
On Sat, Sep 19, 2020 at 1:22 AM Claude M wrote:
> Hello,
>
> I upgrad
rs can write/remove the stored object. What if the previous owner
failed to release the lock (e.g., dead before releasing)? Would there be
any problem?
## HA storage > HA data clean up
If the ConfigMap is destroyed on `kubectl delete deploy `, how
is the HA data retained?
Thank you~
Xintong So
, thus separating the pipeline into
several slot sharing groups will not bring any benefit. If you are just
trying out with the slot sharing groups or preparing for later deploying
the execution to a distributed cluster, then there should be no problem.
Thank you~
Xintong Song
On Thu, Sep 10, 20
] the
cluster to allocate slots evenly across task managers.
Thank you~
Xintong Song
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/concepts/flink-architecture.html#tasks-and-operator-chains
[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/internals/job_scheduling
you~
Xintong Song
On Mon, Aug 31, 2020 at 1:33 PM lec ssmi wrote:
> HI:
> Generally speaking, when we submitting the flink program, the number of
> taskmanager and the memory of each tn will be specified. And the smallest
> real execution unit of flink should be operator.
Congratulations Dian~!
Thank you~
Xintong Song
On Thu, Aug 27, 2020 at 7:42 PM Jark Wu wrote:
> Congratulations Dian!
>
> Best,
> Jark
>
> On Thu, 27 Aug 2020 at 19:37, Leonard Xu wrote:
>
> > Congrats, Dian! Well deserved.
> >
> > Best
> > Le
Hi Vishwas,
According to the log, heap space is 13+GB, which looks fine.
Several reasons might lead to the heap space OOM:
- Memory leak
- Not enough GC threads
- Concurrent GC starts too late
- ...
I would suggest taking a look at the GC logs.
Thank you~
Xintong Song
On Fri
ption
`taskmanager.host` for your task managers, see if that is reflected in the
metrics.
Thank you~
Xintong Song
On Wed, Aug 12, 2020 at 3:06 PM Nikola Hrusov wrote:
> Hello,
>
> After upgrading the docker image for flink to 1.11.1 from 1.9 the hostname
> of the taskmanagers reported to
I did a simple test on my
laptop, launching a docker container with cpu limit configured. Inside the
container, I can still see all my machine's cpus.
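For anyone who wants to check this from inside a container, here is a hedged
Python sketch. It assumes a cgroup v2 host where the CPU quota is exposed in
/sys/fs/cgroup/cpu.max as "<quota> <period>" (or "max <period>" when
unlimited); the path and both function names are assumptions of this sketch,
not something Flink itself reads.

```python
import os


def parse_cpu_max(text):
    """Parse the content of a cgroup v2 cpu.max file into a CPU count, or None if unlimited."""
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def effective_cpus(cpu_max_path="/sys/fs/cgroup/cpu.max"):
    """Return the cgroup CPU limit if one is set, otherwise the visible CPU count."""
    try:
        with open(cpu_max_path) as f:
            limit = parse_cpu_max(f.read())
    except (OSError, ValueError):
        limit = None   # file missing or unreadable: no limit information available
    return limit if limit is not None else os.cpu_count()


print("visible cpus:", os.cpu_count())
print("effective cpus:", effective_cpus())
```

The visible CPU count is typically the machine's, while the effective count
reflects the configured limit, which matches what I observed.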
Thank you~
Xintong Song
On Wed, Aug 12, 2020 at 1:19 AM Bajaj, Abhinav
wrote:
> Hi,
>
>
>
> Reaching out to folks running Fl