Let the total time on my slot 0 process be S + C + B + I = serial computation + communication + busy wait + idle. Is there a way to find out S? S + C would probably also be useful, since I assume C is low.
The problem is that I is roughly 0 and B is big. Because B is big, the usual process timing methods don't work: a busy-waiting process is burning CPU, so its CPU time doesn't separate B from S. If B were all charged to "system" time as opposed to "user" time, I could use the user time as an estimate of S, but I don't think that's the case. Can anyone confirm that?

If S is big, I might be able to gain by parallelizing in a different way. By S I mean the serial computation that is part of my algorithm, not the technical fact that all computation on a given slot is serial.

I'm running R/Rmpi.

Thanks.
Ross
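
P.S. To make the question concrete: the only fallback I can think of is timing the pieces by hand with wall-clock (elapsed) time, since elapsed time doesn't depend on whether a wait is busy or idle. Below is a minimal sketch of that idea, not something I have validated on the real job. The names `timed` and `timers` are made up for illustration, and `Sys.sleep` is only a stand-in for whatever blocking receive the slot 0 process actually sits in.

```r
# Hypothetical helper: accumulate elapsed (wall-clock) seconds per label.
timers <- new.env()

timed <- function(label, expr) {
  start <- proc.time()[["elapsed"]]
  on.exit({
    spent <- proc.time()[["elapsed"]] - start
    prev <- if (exists(label, envir = timers, inherits = FALSE)) {
      get(label, envir = timers)
    } else {
      0
    }
    assign(label, prev + spent, envir = timers)
  })
  force(expr)  # expr is evaluated lazily, so the work happens here
}

# Serial part of the algorithm on slot 0 (placeholder computation).
result <- timed("S", sum(sqrt(seq_len(1e6))))

# Stand-in for waiting on the slaves; in the real code this would wrap
# the blocking Rmpi call rather than Sys.sleep.
timed("B", Sys.sleep(0.2))

print(unlist(mget(c("S", "B"), envir = timers)))
```

This only measures the sections I remember to wrap, and it lumps communication in with whichever label surrounds it, so I'd still like to know whether there is a more direct way to get S.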