Dear all,

I have spent quite some time on our in-house CFD and FSI solvers, which are matrix-based and use deal.II, MPI, and the AMG packages of Trilinos and PETSc, all of which are wonderfully accessible even for engineers like me. So far, my computations have focused on problems with a relatively small DoF count (at most about 10 million), and the number of MPI ranks was eyeballed, staying below 20. At this stage, I would like to know:
a) Which (free) profiling tools can you recommend? I watched Wolfgang's video lecture on that topic, but I would like to hear more opinions. I want to see which parts of the code take time, beyond what the (already detailed) TimerOutput reports.

b) If I simply run "mpirun -n 4 mycode" on a machine with 8 physical cores, why do both PETSc and Trilinos use all 8 cores during the AMG setup and solve? I observed this with htop, even when running an off-the-shelf "step-40.release" as included in the library. Does anyone else see that? During the AMG setup and solve for "mpirun -n 8 step-40", it looks something like this:

[image: screenshot_trilinos_step40_mpirun_n_8.png]

It might be linked to the installation on the server, where I used candi. On my local machine, however, this does not happen.

Any hints are very welcome. Thanks for reading!

Best regards & greetings from Graz
Richard