I ran ompi_info --param btl sm and this is the output:

MCA btl: parameter "btl_base_debug" (current value: "0")
         If btl_base_debug is 1 standard debug is output, if > 1 verbose debug is output
MCA btl: parameter "btl" (current value: <none>)
         Default selection set of components for the btl framework (<none> means "use all components that can be found")
MCA btl: parameter "btl_base_verbose" (current value: "0")
         Verbosity level for the btl framework (0 = no verbosity)
MCA btl: parameter "btl_sm_free_list_num" (current value: "8")
MCA btl: parameter "btl_sm_free_list_max" (current value: "-1")
MCA btl: parameter "btl_sm_free_list_inc" (current value: "64")
MCA btl: parameter "btl_sm_exclusivity" (current value: "65535")
MCA btl: parameter "btl_sm_latency" (current value: "100")
MCA btl: parameter "btl_sm_max_procs" (current value: "-1")
MCA btl: parameter "btl_sm_sm_extra_procs" (current value: "2")
MCA btl: parameter "btl_sm_mpool" (current value: "sm")
MCA btl: parameter "btl_sm_eager_limit" (current value: "4096")
MCA btl: parameter "btl_sm_max_frag_size" (current value: "32768")
MCA btl: parameter "btl_sm_size_of_cb_queue" (current value: "128")
MCA btl: parameter "btl_sm_cb_lazy_free_freq" (current value: "120")
MCA btl: parameter "btl_sm_priority" (current value: "0")
MCA btl: parameter "btl_base_warn_component_unused" (current value: "1")
         This parameter is used to turn on warning messages when certain NICs are not used
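So sm is built and available. One way to check whether it is actually being used between the two co-located processes, along the lines Ralph suggests below, is to restrict the BTL list and raise the btl framework's verbosity (btl_base_verbose appears in the output above). These command lines are a generic sketch; the application name and verbosity level are illustrative, not from the thread:

    # Should fail with an unreachability error if sm cannot connect the two local procs
    mpirun --mca btl self,sm -np 2 ./my_app

    # Logs BTL component selection for each peer
    mpirun --mca btl self,sm --mca btl_base_verbose 100 -np 2 ./my_app

If the first run still takes 23s on one node, then the slowdown really is happening over shared memory and not over an accidental fallback to tcp.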
2011/3/28 Ralph Castain <r...@open-mpi.org>

> The fact that this exactly matches the time you measured with shared memory
> is suspicious. My guess is that you aren't actually using shared memory at all.
>
> Does your "ompi_info" output show shared memory as being available? Jeff or
> others may be able to give you some params that would let you check whether
> sm is actually being used between those procs.
>
> On Mar 28, 2011, at 7:51 AM, Michele Marena wrote:
>
> What happens with 2 processes on the same node with tcp?
> With --mca btl self,tcp my app runs in 23s.
>
> 2011/3/28 Jeff Squyres (jsquyres) <jsquy...@cisco.com>
>
>> Ah, I didn't catch before that there were more variables than just tcp vs.
>> shmem.
>>
>> What happens with 2 processes on the same node with tcp?
>>
>> E.g., when both procs are on the same node, are you thrashing caches or
>> memory?
>>
>> Sent from my phone. No type good.
>>
>> On Mar 28, 2011, at 6:27 AM, "Michele Marena" <michelemar...@gmail.com>
>> wrote:
>>
>> However, I thank you, Tim, Ralph, and Jeff.
>> My sequential application runs in 24s (wall-clock time).
>> My parallel application runs in 13s with two processes on different nodes.
>> With shared memory, when the two processes are on the same node, my app
>> runs in 23s.
>> I don't understand why.
>>
>> 2011/3/28 Jeff Squyres <jsquy...@cisco.com>
>>
>>> If your program runs faster across 3 processes, 2 of which are local to
>>> each other, with --mca btl tcp,self compared to --mca btl tcp,sm,self,
>>> then something is very, very strange.
>>>
>>> Tim cites all kinds of things that can cause slowdowns, but it's still
>>> very, very odd that simply enabling the shared-memory communication
>>> channel in Open MPI *slows your overall application down*.
>>>
>>> How much does your application slow down in wall-clock time? Seconds?
>>> Minutes? Hours? (Anything less than 1 second is in the noise.)
>>>
>>> On Mar 27, 2011, at 10:33 AM, Ralph Castain wrote:
>>>
>>> > On Mar 27, 2011, at 7:37 AM, Tim Prince wrote:
>>> >
>>> >> On 3/27/2011 2:26 AM, Michele Marena wrote:
>>> >>> Hi,
>>> >>> My application performs well without shared memory, but with shared
>>> >>> memory I get worse performance than without it.
>>> >>> Am I making a mistake? Am I overlooking something?
>>> >>> I know Open MPI uses the /tmp directory to allocate shared memory,
>>> >>> and /tmp is in the local filesystem.
>>> >>
>>> >> I guess you mean shared-memory message passing. Among the relevant
>>> >> parameters may be the message size at which your implementation
>>> >> switches from a cached copy to a non-temporal one (if you are on a
>>> >> platform where that terminology is used). If built with Intel
>>> >> compilers, for example, the copy may be performed by
>>> >> intel_fast_memcpy, with a default setting that uses the non-temporal
>>> >> path when the message exceeds some preset size, e.g. 50% of the
>>> >> smallest L2 cache for that architecture.
>>> >> A quick search of past posts seems to indicate that Open MPI doesn't
>>> >> itself invoke non-temporal copies, but there appear to be several
>>> >> useful articles not connected with Open MPI.
>>> >> In case guesses aren't sufficient, it's often necessary to profile
>>> >> (gprof, oprofile, VTune, ...) to pin this down.
>>> >> If shared-memory message passing slows your application down, the
>>> >> question is whether this is due to excessive eviction of data from
>>> >> cache; that is not a simple question, as most recent CPUs have 3
>>> >> levels of cache, and your application may need more or less of the
>>> >> data that was in use prior to the message receipt, and may
>>> >> immediately use only a small piece of a large message.
>>> >
>>> > There were several papers published in earlier years about shared-memory
>>> > performance in the 1.2 series. There were known problems with that
>>> > implementation, which is why it was heavily revised for the 1.3/1.4
>>> > series.
>>> >
>>> > You might also look at the following links, though much of the material
>>> > has been updated for the 1.3/1.4 series, as we don't really support 1.2
>>> > any more:
>>> >
>>> > http://www.open-mpi.org/faq/?category=sm
>>> > http://www.open-mpi.org/faq/?category=perftools
>>> >
>>> >> --
>>> >> Tim Prince
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
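Tim's cache-eviction point and the btl_sm_eager_limit of 4096 bytes in the ompi_info output above can be separated from the application's own memory traffic with a small ping-pong microbenchmark. The following is a minimal generic MPI sketch, not code from the thread; the iteration count and size sweep are illustrative:

    /* pingpong.c: time round trips between ranks 0 and 1 across message sizes.
     * Build with: mpicc -O2 pingpong.c -o pingpong */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        const int iters = 1000;
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Sweep message sizes across the 4 KB sm eager limit reported above. */
        for (int size = 1024; size <= 65536; size *= 2) {
            char *buf = malloc(size);
            MPI_Barrier(MPI_COMM_WORLD);
            double t0 = MPI_Wtime();
            for (int i = 0; i < iters; i++) {
                if (rank == 0) {
                    MPI_Send(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                    MPI_Recv(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                } else if (rank == 1) {
                    MPI_Recv(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                    MPI_Send(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
                }
            }
            double t1 = MPI_Wtime();
            if (rank == 0)
                printf("%6d bytes: %8.2f us/roundtrip\n",
                       size, 1e6 * (t1 - t0) / iters);
            free(buf);
        }
        MPI_Finalize();
        return 0;
    }

Running the same sweep on one node once with --mca btl self,sm and once with --mca btl self,tcp makes the comparison direct: if sm is genuinely in use, small-message round trips should be much faster than over tcp, and any change in behavior near 4096 bytes would line up with the sm eager limit rather than with the application's cache footprint.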