Dear all,

I'm trying to speed up the OpenMP implementation, looking at GCC and the Intel compiler. I'm working with multisort because I thought it has a lot of branches, and I applied a tree-shaped task model to it. I used Extrae and Paraver to profile the program. I then ran it on MinoTauro, a Top500 supercomputer, on Intel Xeon E5649 processors (6 cores, 2 threads per core) at 2.53 GHz.
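To make the tree task model concrete, here is a minimal sketch. It is not the linked multisort code: it assumes a plain two-way merge sort, and the names sort_tasks, merge_halves, parallel_sort and the CUTOFF value are hypothetical choices for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cut-off: at or below this size, sort sequentially
   instead of creating more tasks. */
#define CUTOFF 1024

static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Merge the sorted halves a[0..mid) and a[mid..n), using tmp as scratch. */
static void merge_halves(long *a, long *tmp, size_t mid, size_t n)
{
    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < n)
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid)
        tmp[k++] = a[i++];
    while (j < n)
        tmp[k++] = a[j++];
    memcpy(a, tmp, n * sizeof *a);
}

/* Every call above the cut-off creates two child tasks,
   so the tasks form a binary tree. */
static void sort_tasks(long *a, long *tmp, size_t n)
{
    if (n <= CUTOFF) {
        qsort(a, n, sizeof *a, cmp_long);
        return;
    }
    size_t mid = n / 2;
    #pragma omp task
    sort_tasks(a, tmp, mid);
    #pragma omp task
    sort_tasks(a + mid, tmp + mid, n - mid);
    /* Wait for both children before merging their results. */
    #pragma omp taskwait
    merge_halves(a, tmp, mid, n);
}

/* One thread creates the root of the tree; the whole team runs the tasks. */
void parallel_sort(long *a, size_t n)
{
    if (n < 2)
        return;
    long *tmp = malloc(n * sizeof *tmp);
    assert(tmp != NULL);
    #pragma omp parallel
    #pragma omp single
    sort_tasks(a, tmp, n);
    free(tmp);
}
```

How the runtime orders and places the tasks created at each level of this tree is exactly what differs between schedulers, and it is what the traces make visible. Built without OpenMP support, the pragmas are ignored and the same code runs sequentially.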
The following report compares OpenMP task scheduling in GCC with the Intel C compiler:
https://github.com/grypp/gcc-gsoc-taskscheduler/raw/master/report.pdf

Here is my OpenMP code:
https://github.com/grypp/gcc-gsoc-taskscheduler/blob/master/multisort-omp.c

I think GCC's tasks/threads spend more time waiting idle than the Intel compiler's threads do.

In addition, I profiled the Nanos task scheduler with the Mercurium compiler. I used three scheduler types in Nanos:

- Work-First (WF)
- Breadth-First (BF)
- Cilk (CILK)

I ran the program on a computer with 12 processors. The following image shows the traces of the schedulers:
https://raw.github.com/grypp/gcc-gsoc-taskscheduler/master/nns.png

Here is my OmpSs code:
https://github.com/grypp/gcc-gsoc-taskscheduler/blob/master/multisort-ompss-tree.c

Interestingly, I got a better speedup with breadth-first than with the others. Execution times:

- Work-First (WF): ~0.5
- Breadth-First (BF): ~0.22
- Cilk (CILK): ~0.49

I also read this mail: "http://gcc.gnu.org/ml/gcc/2011-04/msg00040.html". I suppose a lazy task creation algorithm was implemented in GCC, but I do not know for sure. Lazy task creation may be good, but the task scheduling is not. Currently, I'm planning to replace the task scheduling algorithm with BF, but I am not sure whether that is a good idea. I will also ask my professor for ideas about task scheduling; he is the main author of the papers you gave me and the director of the Barcelona Supercomputing Center.

Do you have any advice?

Regards,
Güray Özen
Polytechnic University of Catalonia

2013/4/26 guray.ozen <guray.o...@gmail.com>:
> Hi,
>
> I'm MSc High-Performance Computing student at Polytechnic University
> of Catalonia(BarcelonaTech). I'm interesting openmp task scheduling
> optimization or openmp 3.1 facility taskyield.
>
> @For Task scheduling
> I'm using mercurium compiler already at my university because the
> compiler was developed by my university. In fact i'm working on
> openACC integration at mercurium.
> However i haven't enough knowledge
> nanos scheduler.
>
> I read 3 articles that you recommend related task scheduling. Cilk and
> work-first scheduling algorithms makes sense for me with cut-off
> mechanism. I suppose cut-off mechanism will decide on run-time.
>
> Now I'm trying to understand which task scheduler is better in some
> cases at nanos scheduler
> (https://pm.bsc.es/projects/nanox/wiki/UserManual/Schedule). I'll
> analyze result with PARAVER profiling tool.
>
> However i am not sure for this topics. Should i develop a new
> task-scheduler? otherwise should I optimize existing scheduler?
>
>
> King regards,
>
>
> Güray Özen
> Polytechnic University of Catalonia