You could enable object reuse [0] if your application allows it. Adjusting the managed memory size [1] can also help.
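
As a rough sketch of the second suggestion, managed memory is set in flink-conf.yaml. The key names below are the Flink 1.3 ones; the values are illustrative assumptions, not tuned recommendations for this cluster. Object reuse is not a config-file setting: it is switched on in the job itself with env.getConfig().enableObjectReuse(), and it is only safe if the user functions do not hold on to or modify input objects across calls.

```yaml
# flink-conf.yaml (Flink 1.3 key names; values are assumptions for illustration)

# Fraction of the free TaskManager heap reserved for managed memory
# (sorting, hash tables, caching). The default is 0.7.
taskmanager.memory.fraction: 0.8

# Alternatively, an absolute amount in MB. When set, it takes
# precedence over the fraction above.
# taskmanager.memory.size: 8192
```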

Are you using Flink's graph library Gelly?

[0] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/batch/index.html#object-reuse-enabled
[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/config.html#managed-memory

Regards,
Timo

On 23.08.17 at 17:11, Kaepke, Marc wrote:
Does anyone have a current performance test based on PageRank, or an idea why 
Flink lost the comparison?


On 18.08.2017 at 19:51, Kaepke, Marc <marc.kae...@haw-hamburg.de> wrote:

Hi everyone,

I compared Flink and Spark using PageRank. I expected Flink to beat Spark 
or at least match it, but Spark is up to 4x faster than Flink.
I hope I made a mistake, so please help me improve the performance of my 
cluster and configuration.

The cluster has 4 computers:
One JobManager (quad core with Hyper-Threading -> 8 cores, 16GB 
(jobmanager.heap.mb))
Three TaskManagers (each quad core with Hyper-Threading -> 8 cores, 16GB 
(taskmanager.heap.mb))
In total 24 cores/task slots.

I ran PageRank as vertex-centric (Pregel), scatter-gather (SG), 
gather-sum-apply (GSA) and with bulk iteration. The parallelism was 24.
Runtime in ms:
Pregel: 90,000 ms
SG: 64,000 ms
GSA: 80,000 ms
Bulk: 53,000 ms
Spark with Pregel ran in 23,000 ms
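
To put the "up to 4x" figure in context, here is a small sketch that divides each Flink runtime listed above by the Spark runtime. The class and method names (RuntimeRatios, slowdown, flinkRuntimesMs) are made up for illustration; only the millisecond values come from the measurements.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class RuntimeRatios {
    // Spark (Pregel) runtime in milliseconds, taken from the list above.
    static final long SPARK_PREGEL_MS = 23_000L;

    // Flink runtimes in milliseconds, taken from the list above.
    static Map<String, Long> flinkRuntimesMs() {
        Map<String, Long> runtimes = new LinkedHashMap<>();
        runtimes.put("Pregel", 90_000L);
        runtimes.put("SG", 64_000L);
        runtimes.put("GSA", 80_000L);
        runtimes.put("Bulk", 53_000L);
        return runtimes;
    }

    // How many times slower a Flink run was than the Spark run.
    static double slowdown(long flinkMs) {
        return (double) flinkMs / SPARK_PREGEL_MS;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> entry : flinkRuntimesMs().entrySet()) {
            System.out.println(String.format(Locale.ROOT, "%-6s %.2fx slower than Spark",
                    entry.getKey(), slowdown(entry.getValue())));
        }
    }
}
```

The gap ranges from about 2.3x for the bulk iteration to about 3.9x for the vertex-centric run, so the choice of iteration model already accounts for a good part of the difference.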

The input file was: https://snap.stanford.edu/data/wiki-topcats.html

Thanks for helping!

Marc

