Hi,

> Is there a simple way to get profiling information in Flink?

Flink doesn’t provide any special tooling for that. Just use your chosen profiler, for example: Oracle’s Mission Control (free on non-production clusters, no need to install anything if you are already using Oracle’s JVM), VisualVM (I think free), or YourKit (paid). For each of them there is plenty of online material on how to use them for both local and remote profiling.

Piotrek

> On 31 Oct 2019, at 14:05, Habib Mostafaei <ha...@inet.tu-berlin.de> wrote:
>
> I enclosed all logs from the run and for this run I used parallelism one.
> However, for other runs I checked and found that all parallel workers were
> working properly. Is there a simple way to get profiling information in Flink?
>
> Best,
>
> Habib
>
> On 10/31/2019 2:54 AM, Zhenghua Gao wrote:
>> I think more runtime information would help figure out where the problem is.
>> 1) how many parallelisms actually working
>> 2) the metrics for each operator
>> 3) the jvm profiling information, etc
>>
>> Best Regards,
>> Zhenghua Gao
>>
>>
>> On Wed, Oct 30, 2019 at 8:25 PM Habib Mostafaei <ha...@inet.tu-berlin.de> wrote:
>> Thanks Gao for the reply. I used the parallelism parameter with different
>> values like 6 and 8 but still the execution time is not comparable with a
>> single threaded python script. What would be the reasonable value for the
>> parallelism?
>>
>> Best,
>>
>> Habib
>>
>> On 10/30/2019 1:17 PM, Zhenghua Gao wrote:
>>> The reason might be the parallelism of your task is only 1, that's too low.
>>> See [1] to specify proper parallelism for your job, and the execution time
>>> should be reduced significantly.
>>>
>>> [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html
>>>
>>> Best Regards,
>>> Zhenghua Gao
>>>
>>>
>>> On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <ha...@inet.tu-berlin.de> wrote:
>>> Hi all,
>>>
>>> I am running Flink on a standalone cluster and getting very long
>>> execution time for the streaming queries like WordCount for a fixed text
>>> file. My VM runs on a Debian 10 with 16 cpu cores and 32GB of RAM. I
>>> have a text file with size of 2GB. When I run the Flink on a standalone
>>> cluster, i.e., one JobManager and one taskManager with 25GB of heapsize,
>>> it took around two hours to finish counting this file while a simple
>>> python script can do it in around 7 minutes. Just wondering what is
>>> wrong with my setup. I ran the experiments on a cluster with six
>>> taskManagers, but I still get very long execution time like 25 minutes
>>> or so. I tried to increase the JVM heap size to have lower execution
>>> time but it did not help. I attached the log file and the Flink
>>> configuration file to this email.
>>>
>>> Best,
>>>
>>> Habib
>>>
>
> <flink-xxx-client-xxx.log><flink-xxx-standalonesession-0-xxx.log><flink-xxx-taskexecutor-0-xxx.log>