Hi,
Unfortunately your VisualVM snapshot doesn’t contain the profiler output. It
should look like this [1].
> Checking the timeline of execution shows that the source operation is done in
> less than a second, while the Map and Reduce operations take a long time to run.
It could well be that the ove
Hi,
On 11/1/2019 4:40 PM, Piotr Nowojski wrote:
Hi,
More important would be the code profiling output. I think VisualVM
allows you to share the code profiling result as “snapshots”? If you could
analyse or share this, it would be helpful.
Enclosed is a snapshot of VisualVM.
From the attached s
Hi,
I ran the streaming WordCount with a 2GB text file (copied
/usr/share/dict/words 400 times) last weekend and didn't reproduce your
result (16 minutes in my case).
But I found some clues that may help you:
The streaming WordCount job outputs all intermediate results to your
output file (if specified)
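To illustrate the point about intermediate results, here is a minimal sketch in plain Python (not the Flink API). It assumes that the streaming job emits an updated count for every incoming word, while a batch-style job emits one record per distinct word. The function names `running_counts` and `final_counts` are hypothetical and only serve the illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def running_counts(words: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield one (word, count-so-far) record per input word."""
    counts: Counter = Counter()
    for word in words:
        counts[word] += 1
        # An updated record is produced for every single input element.
        yield word, counts[word]


def final_counts(words: Iterable[str]) -> Dict[str, int]:
    """Return only the final count for each distinct word."""
    return dict(Counter(words))


if __name__ == "__main__":
    sample = "to be or not to be".split()
    streamed = list(running_counts(sample))
    print(len(streamed), "streamed records vs", len(final_counts(sample)), "final records")
```

Under this assumption the number of output records grows with the number of input words rather than with the number of distinct words, so writing the output can become a noticeable part of the run time.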
Hi,
More important would be the code profiling output. I think VisualVM allows you to
share the code profiling result as “snapshots”? If you could analyse or share
this, it would be helpful.
From the attached screenshot the only thing that is visible is that there are
no GC issues, and secondly th
Hi Piotrek,
Thanks for the list of profilers. I used VisualVM and here is the
resource usage for the TaskManager.
Habib
On 11/1/2019 9:48 AM, Piotr Nowojski wrote:
Hi,
> Is there a simple way to get profiling information in Flink?
Flink doesn’t provide any special tooling for that. Just use
Hi,
> Is there a simple way to get profiling information in Flink?
Flink doesn’t provide any special tooling for that. Just use your chosen
profiler, for example: Oracle’s Mission Control (free on non-production
clusters, no need to install anything if already using Oracle’s JVM), VisualVM
(I
I used the streaming WordCount provided by Flink, and the file contains text
like "This is some text...". I just copied it several times.
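For anyone who wants to build a comparable input, the following is a minimal Python sketch of the "copy the text several times" step. The helper name `repeat_file`, the file names, and the number of copies are assumptions; the thread does not say how the file was actually produced.

```python
import shutil
from pathlib import Path


def repeat_file(src: Path, dst: Path, copies: int) -> int:
    """Write `copies` back-to-back copies of src into dst; return dst's size in bytes."""
    with open(dst, "wb") as out:
        for _ in range(copies):
            with open(src, "rb") as inp:
                # Stream in chunks so a large source file is never held in memory.
                shutil.copyfileobj(inp, out)
    return dst.stat().st_size


if __name__ == "__main__":
    # Hypothetical paths and copy count, for illustration only.
    size = repeat_file(Path("sample.txt"), Path("input.txt"), copies=400)
    print("wrote", size, "bytes")
```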
Best,
Habib
On 11/1/2019 6:03 AM, Zhenghua Gao wrote:
2019-10-30 15:59:52,122 INFO org.apache.flink.runtime.taskmanager.Task
- Split Reader:
2019-10-30 15:59:52,122 INFO org.apache.flink.runtime.taskmanager.Task - Split Reader: Custom File Source -> Flat Map (1/1) (6a17c410c3e36f524bb774d2dffed4a4) switched from DEPLOYING to RUNNING.
2019-10-30 17:45:10,943 INFO org.apache.flink.runtime.taskmanager.Task
I enclosed all logs from the run; for this run I used a parallelism of
one. However, for other runs I checked and found that all parallel
workers were working properly. Is there a simple way to get profiling
information in Flink?
Best,
Habib
On 10/31/2019 2:54 AM, Zhenghua Gao wrote:
I think
I think more runtime information would help figure out where the problem is.
1) how many parallel instances are actually working
2) the metrics for each operator
3) the JVM profiling information, etc.
*Best Regards,*
*Zhenghua Gao*
On Wed, Oct 30, 2019 at 8:25 PM Habib Mostafaei
wrote:
> Thanks Gao for t
> "Georgios Smaragdakis" <georg...@inet.tu-berlin.de>; "Niklas Semmler"
> <nik...@inet.tu-berlin.de>
> Sent: 30/10/2019 12:25:28
> Subject: Re: low performance in running queries
>
>> Thanks Gao for the reply. I used the parallelis
To: "Zhenghua Gao"
Cc: "user" ; "Georgios Smaragdakis"
; "Niklas Semmler"
Sent: 30/10/2019 12:25:28
Subject: Re: low performance in running queries
Thanks Gao for the reply. I used the parallelism parameter with
different values like 6 and 8 but still the
The reason might be that the parallelism of your task is only 1, which is too low.
See [1] to specify a proper parallelism for your job; the execution time
should then be reduced significantly.
[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html
*Best Regards,*
*Zhenghua Gao*
On
Thanks Gao for the reply. I used the parallelism parameter with
different values like 6 and 8, but the execution time is still not
comparable with a single-threaded Python script. What would be a
reasonable value for the parallelism?
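The Python script used for the comparison is not shown in this thread. As a rough idea of what such a single-threaded baseline could look like, here is a minimal sketch that assumes lowercasing and whitespace tokenization (the Flink example may tokenize differently). The names `count_words` and `timed_count` and the input path are hypothetical.

```python
import time
from collections import Counter
from typing import Tuple


def count_words(path: str) -> Counter:
    """Count whitespace-separated, lowercased words, reading the file line by line."""
    counts: Counter = Counter()
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        for line in f:
            counts.update(line.lower().split())
    return counts


def timed_count(path: str) -> Tuple[Counter, float]:
    """Return the word counts together with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    counts = count_words(path)
    return counts, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical input path, for illustration only.
    counts, seconds = timed_count("input.txt")
    print(len(counts), "distinct words in", round(seconds, 2), "seconds")
```

A baseline like this only produces the final counts, so its run time is not directly comparable with a job that also writes every intermediate update.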
Best,
Habib
On 10/30/2019 1:17 PM, Zhenghua Gao wrote:
Th