[OMPI users] OpenMPI failed when running across two mac machines

2012-01-20 Thread Teng Lin
Hi, We are distributing Open MPI as part of a software suite. Therefore, the prefix we used for building is not expected to be the same when running on a customer's machine. However, we did manage to get it running by setting OPAL_PREFIX, PATH and LD_LIBRARY_PATH on Linux. We tried to do the same thin

Re: [OMPI users] deadlock when calling MPI_gatherv

2010-04-27 Thread Teng Lin
Hi Terry, > How does the stack for the non-SM BTL run look? I assume it probably is the same. Also, can you dump the message queues for rank 1? What's interesting is you have a bunch of pending receives; do you expect that to be the case when the MPI_Gatherv occurred? It turns out we

Re: [OMPI users] deadlock when calling MPI_gatherv

2010-04-26 Thread Teng Lin
On Apr 26, 2010, at 9:07 PM, Trent Creekmore wrote: > You are going to have to debug and trace the program to find out where it is stopping. You may want to try using KDbg, a graphical front end for the command-line debugger gdb, which makes it a LOT easier, or use Eclipse. As a matter of

[OMPI users] deadlock when calling MPI_gatherv

2010-04-26 Thread Teng Lin
Hi, We recently ran into a deadlock when calling MPI_Gatherv with Open MPI 1.3.4. It seemed to have something to do with sm at first. However, it still hangs even after turning off the sm BTL. Any idea how to track down the problem? Thanks, Teng # Stac
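The preview mentions "turning off sm": in Open MPI, a BTL can be excluded at launch time with the `^` (exclusion) prefix on the `btl` MCA parameter. A minimal sketch of the two equivalent forms; the process count and program name (`gatherv_test`) are placeholders, not from the thread:

```shell
# Exclude the shared-memory (sm) BTL so same-node ranks fall back to TCP.
# Useful for checking whether a hang is specific to the sm transport.
mpirun --mca btl ^sm -np 4 ./gatherv_test

# Equivalent inclusive form: list the allowed transports explicitly
# (the "self" BTL must always be included).
mpirun --mca btl self,tcp -np 4 ./gatherv_test
```

If the hang persists with sm excluded, as it did in this thread, the sm transport itself is unlikely to be the root cause.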

Re: [OMPI users] Bug report in plm_lsf_module.c

2010-04-26 Thread Teng Lin
Ralph, Thanks for the prompt response. On Apr 26, 2010, at 2:34 PM, Ralph Castain wrote: > Appreciate your input! None of the developers have access to an LSF machine any more, so we can't test it :-/ What version of OMPI does this patch apply to? The patch is applied to 1.3.4, which is t

[OMPI users] Bug report in plm_lsf_module.c

2010-04-26 Thread Teng Lin
Hi, We recently identified a bug in our LSF cluster. The job always hangs if all LSF-related components are present. One observation we have is that the job works fine after removing all LSF-related components. Below is the message from stdout: [:24930] mca: base: components_open: Looking for ess compone
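Rather than removing the LSF component files from the installation, the same selection can likely be forced at run time with MCA exclusion parameters. A hedged sketch, assuming the hang is in component selection (framework names `plm` and `ras` are the launch and resource-allocation frameworks that have LSF components in the 1.3 series; `./a.out` is a placeholder):

```shell
# Tell ORTE not to select the LSF components, instead of deleting
# the component files from the Open MPI installation tree.
#   plm = process launch framework, ras = resource allocation framework
mpirun --mca plm ^lsf --mca ras ^lsf -np 4 ./a.out
```

This is a workaround for testing, not a fix for the underlying bug in plm_lsf_module.c.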

[OMPI users] OPAL_PREFIX is not passed to remote node in pls_rsh_module.c

2008-10-17 Thread Teng Lin
Hi All, We have bundled Open MPI with our product and shipped it to the customer. According to http://www.open-mpi.org/faq/?category=building#installdirs , below is the command we used to launch the MPI program: env OPAL_PREFIX=/path/to/openmpi \ /path/to/openmpi/bin/orterun --prefix /path/to/o
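For reference, the relocated-install launch pattern from that FAQ entry looks like the following sketch; `/new/prefix` is a placeholder for wherever the bundled tree was copied, and `hello_c` is a stand-in program:

```shell
# Point Open MPI at the relocated installation tree.
export OPAL_PREFIX=/new/prefix
export PATH="$OPAL_PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$OPAL_PREFIX/lib:$LD_LIBRARY_PATH"

# --prefix tells orterun which tree to use when starting remote daemons;
# -x forwards the OPAL_PREFIX environment variable to the remote processes,
# which is the crux of this thread (it is not passed automatically).
"$OPAL_PREFIX/bin/orterun" --prefix "$OPAL_PREFIX" \
    -x OPAL_PREFIX -np 2 ./hello_c
```

As the subject notes, setting OPAL_PREFIX only on the launching node is not enough; it must also reach the remote nodes.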

[OMPI users] 32-bit openib btl fails on 64-bit OS

2008-04-06 Thread Teng Lin
Dear All, In order to run a 32-bit program on a 64-bit cluster, one has to build a 32-bit Open MPI. Following some instructions on this mailing list, I successfully built Open MPI 1.2.4 on a 64-bit OS. However, I ran into an openib problem when I tried to run the hello_c program. I also built 64-bit Ope

[OMPI users] Job does not quit even when the simulation dies

2007-11-06 Thread Teng Lin
Hi, I just realized I have a job that has been running for a long time, while some of the nodes have already died. Is there any way to ask the other nodes to quit? [kyla-0-1.local:09741] mca_btl_tcp_frag_send: writev failed with errno=104 [kyla-0-1.local:09742] mca_btl_tcp_frag_send: writev failed with errno=104 T

[OMPI users] Bundling OpenMPI

2007-09-27 Thread Teng Lin
Hi, We would like to distribute Open MPI along with our software to customers; is there any legal issue we need to know about? We can successfully build Open MPI using ./configure --prefix=/some_path; make; make install However, if we do cp -r /some_path /other_path and try to run /other_pat