I haven't run any benchmarks with Flink, or even used it enough to directly help with your question; however, I suspect that the following article might be relevant:

http://dsrg.pdos.csail.mit.edu/2016/06/26/scalability-cost/

Given that the computation you're performing is trivial, it's possible that the additional overhead of serialisation, inter-process communication, state management, etc. that distributed systems like Flink require is dominating the runtime here. Two hours (or even 25 minutes) still seems too long to me, however, so hopefully it really is just a configuration issue of some sort. Either way, if you do figure this out, or if anyone familiar with the article above can relate it to Flink, I'd be very interested in hearing more.
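For what it's worth, the single-threaded baseline being compared against is essentially just a loop over the file. This is a minimal sketch, not the actual script from the thread, but it illustrates how little work per byte is involved:

```python
# Minimal single-threaded word count, similar in spirit to the Python
# baseline mentioned in the thread.
from collections import Counter


def count_words(path):
    """Count whitespace-separated tokens in a text file, one line at a time."""
    counts = Counter()
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        for line in f:
            counts.update(line.split())
    return counts
```

There is no serialisation, no network shuffle, and no state backend here, which is exactly the kind of overhead the article above discusses.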

Regards,
Chris


------ Original Message ------
From: "Habib Mostafaei" <ha...@inet.tu-berlin.de>
To: "Zhenghua Gao" <doc...@gmail.com>
Cc: "user" <user@flink.apache.org>; "Georgios Smaragdakis" <georg...@inet.tu-berlin.de>; "Niklas Semmler" <nik...@inet.tu-berlin.de>
Sent: 30/10/2019 12:25:28
Subject: Re: low performance in running queries

Thanks, Gao, for the reply. I tried the parallelism parameter with different values, like 6 and 8, but the execution time is still not comparable with a single-threaded Python script. What would be a reasonable value for the parallelism?

Best,

Habib

On 10/30/2019 1:17 PM, Zhenghua Gao wrote:
The reason might be that the parallelism of your task is only 1, which is too low. See [1] for how to specify a proper parallelism for your job; the execution time should then be reduced significantly.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html
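Concretely, parallelism can be set cluster-wide in flink-conf.yaml or per job on the command line. A minimal sketch (the value 8 is just an illustration, not a recommendation for this workload):

```
# flink-conf.yaml: default parallelism for all jobs on this cluster
parallelism.default: 8
```

```
# or per submitted job, via the CLI -p flag
./bin/flink run -p 8 path/to/your-job.jar
```

Note that parallelism beyond the number of available task slots will leave the job waiting for resources, so it should not exceed taskmanager.numberOfTaskSlots times the number of TaskManagers.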

Best Regards,
Zhenghua Gao


On Tue, Oct 29, 2019 at 9:27 PM Habib Mostafaei <ha...@inet.tu-berlin.de> wrote:
Hi all,

I am running Flink on a standalone cluster and getting very long execution times for streaming queries like WordCount on a fixed text file. My VM runs Debian 10 with 16 CPU cores and 32 GB of RAM, and I have a text file of about 2 GB. When I run Flink on a standalone cluster, i.e., one JobManager and one TaskManager with 25 GB of heap size, it takes around two hours to finish counting this file, while a simple Python script can do it in around 7 minutes. I am just wondering what is wrong with my setup. I also ran the experiments on a cluster with six TaskManagers, but I still get a very long execution time, around 25 minutes. I tried increasing the JVM heap size to lower the execution time, but it did not help. I have attached the log file and the Flink configuration file to this email.

Best,

Habib

--
Habib Mostafaei, Ph.D.
Postdoctoral researcher
TU Berlin,
FG INET, MAR 4.003
Marchstraße 23, 10587 Berlin
