Hi Yu, Hi Yun,

Brilliant idea! People are keen to use it. Thanks for your proposal! I was
wondering:

1. Will it replace the current flame graph, i.e. will the current flame
graph be deprecated and removed?
2. Does it make sense to report the performance difference between
enabling and disabling it?

Best regards,
Jing

On Mon, Oct 9, 2023 at 1:50 PM Yu Chen <yuchen.e...@gmail.com> wrote:

> Hi zhanghao,
>
> Yes, I agree with you. We'll take the JobManager into consideration and
> update the FLIP later!
>
> Best,
> Yu Chen
>
> Zhanghao Chen <zhanghao.c...@outlook.com> wrote on Mon, Oct 9, 2023 at 19:22:
>
> > Hi Yun and Yu,
> >
> > Thanks for driving this. This would definitely help users identify
> > performance bottlenecks, especially in cases where the bottleneck lies
> > in the system stack (e.g. GC), and a big +1 for the downloadable flame
> > graph to ease sharing. I'm wondering if we could add this for the job
> > manager as well. In the OLAP scenario, and sometimes in the streaming
> > scenario (when there are heavy operations during execution plan
> > generation or in operator coordinators), the JM can be a bottleneck as
> > well.
> >
> > Best,
> > Zhanghao Chen
> > ________________________________
> > From: Yu Chen <yuchen.e...@gmail.com>
> > Sent: Monday, October 9, 2023 17:24
> > To: dev@flink.apache.org <dev@flink.apache.org>
> > Subject: [DISCUSS] FLIP-375: Built-in cross-platform powerful java
> > profiler on taskmanagers
> >
> > Hi all,
> >
> > Yun Tang and I are opening this thread to discuss our proposal to
> > integrate async-profiler's capabilities for profiling TaskManagers
> > (e.g., generating flame graphs) into the Flink Web UI [1].
> >
> >
> > Currently, Flink provides ThreadDump and Operator-Level Flame Graphs by
> > sampling task threads. The results generated this way are missing the
> > relevant stacks of Java threads and system calls. The async-profiler [2]
> > is a low-overhead sampling profiler for Java, but the steps to use it in
> > a production environment are cumbersome and come with permission issues
> > and security risks.
> >
> > Therefore, we propose adding REST APIs that invoke async-profiler on
> > multiple platforms through JNI and can be operated easily from the Web
> > UI. This enhancement will improve the efficiency and experience of Flink
> > users in identifying performance bottlenecks.
> >
> >
> >
> > Please refer to the FLIP document for more details about the proposed
> > design and implementation. We welcome any feedback and opinions on this
> > proposal.
> >
> >
> >
> > [1] FLIP-375: Built-in cross-platform powerful java profiler on
> > taskmanagers - Apache Flink - Apache Software Foundation
> > <https://cwiki.apache.org/confluence/display/FLINK/FLIP-375%3A+Built-in+cross-platform+powerful+java+profiler+on+taskmanagers>
> >
> > [2] GitHub - async-profiler/async-profiler: Sampling CPU and HEAP
> > profiler for Java featuring AsyncGetCallTrace + perf_events
> > <https://github.com/async-profiler/async-profiler>
> >
> >
> >
> > Best regards,
> >
> > Yun Tang and Yu Chen
> >
>
