I am a beginner in MPI. I ran an example code using OpenMPI and it seems to work. Then I tried a parallel example from the PETSc tutorials folder (ex5) with the following command:

mpirun -np 4 ex5

It runs, but the results are less accurate than when I simply run ex5 on its own. Is that normal?

After that, I submitted the job to a supercomputer, which allocated me 4 nodes with 8 processors each. When I checked the load on each node, I found:

Node  LOAD  CPU
0     32    800
1     0     0
2     0     0
3     0     0

But for other people's jobs, the load looks like this:

Node  LOAD
0     8
1     8
2     8
3     8

It seems the master node takes all the load, and the run is even slower than on a single processor. Does anyone have any idea about this?

Thank you in advance!

Sincerely,
YIN
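To put numbers on the imbalance above, here is a small sketch that only does the arithmetic on the load table from my job. The figures are copied from the table. The function name `summarize` and the "oversubscription" ratio (load divided by processors per node) are just my own illustration, not something reported by the cluster tools, and I am assuming the LOAD column roughly counts runnable processes.

```python
# Load figures copied from the table in the post: node index -> reported LOAD.
observed_load = {0: 32, 1: 0, 2: 0, 3: 0}

# From the post: each allocated node has 8 processors.
PROCESSORS_PER_NODE = 8


def summarize(load_by_node, procs_per_node):
    """Compare each node's load with an even split across all nodes.

    Assumption (hypothetical helper): 'oversubscription' is simply
    load / processors, so 1.0 means one unit of load per processor.
    """
    total = sum(load_by_node.values())
    even_share = total / len(load_by_node)
    report = {}
    for node, load in sorted(load_by_node.items()):
        report[node] = {
            "load": load,
            "even_share": even_share,
            "oversubscription": load / procs_per_node,
        }
    return report


report = summarize(observed_load, PROCESSORS_PER_NODE)
for node, row in report.items():
    print(
        f"node {node}: load={row['load']}, "
        f"even share={row['even_share']:.0f}, "
        f"load per processor={row['oversubscription']:.1f}"
    )
```

If that reading of LOAD is right, the total load of 32 matches 4 nodes x 8 processors, but all of it sits on node 0, which would mean about 4 units of load per processor there while the other three nodes are idle. The other jobs show the even share of 8 per node.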