Re: [OMPI users] InfiniBand, different OpenFabrics transport types

2011-07-19 Thread Bill Johnstone
Yevgeny, Sorry for the delay in replying -- I'd been out for a few days. - Original Message - > From: Yevgeny Kliteynik > Sent: Thursday, July 14, 2011 12:51 AM > Subject: Re: [OMPI users] InfiniBand, different OpenFabrics transport types   > While I'm trying to find an old HCA somewhe

Re: [OMPI users] InfiniBand, different OpenFabrics transport types

2011-07-11 Thread Bill Johnstone
Hi Yevgeny and list, - Original Message - > From: Yevgeny Kliteynik > I'll check the MCA_BTL_OPENIB_TRANSPORT_UNKNOWN thing and get back to you. Thank you. > One question though, just to make sure we're on the same page: so the jobs > do run OK on > the older HCAs, as long as they

Re: [OMPI users] InfiniBand, different OpenFabrics transport types

2011-07-08 Thread Bill Johnstone
Hello, and thanks for the reply. - Original Message - > From: Jeff Squyres > Sent: Thursday, July 7, 2011 5:14 PM > Subject: Re: [OMPI users] InfiniBand, different OpenFabrics transport types > > On Jun 28, 2011, at 1:46 PM, Bill Johnstone wrote: > >> I have

[OMPI users] InfiniBand, different OpenFabrics transport types

2011-06-28 Thread Bill Johnstone
Hello all. I have a heterogeneous network of InfiniBand-equipped hosts which are all connected to the same backbone switch, an older SDR 10 Gb/s unit. One set of nodes uses the Mellanox "ib_mthca" driver, while the other uses the "mlx4" driver. This is on Linux 2.6.32, with Open MPI 1.5.3.

Re: [OMPI users] BLCR support not building on 1.5.3

2011-05-27 Thread Bill Johnstone
Hello, Thank you very much for this.  I've replied further below: - Original Message - > From: Joshua Hursey [...] > What other configure options are you passing to Open MPI? Specifically the > configure test will always fail if '--with-ft=cr' is not specified - by > default Open MPI

[OMPI users] BLCR support not building on 1.5.3

2011-05-26 Thread Bill Johnstone
Hello all. I'm building 1.5.3 from source on a Debian Squeeze AMD64 system, and trying to get BLCR support built in.  I've installed all the packages that I think should be relevant to BLCR support, including: +blcr-dkms +libcr0 +libcr-dev +blcr-util I've also installed blcr-testsuite.  I only

Re: [OMPI users] Making RPM from source that respects --prefix

2009-10-07 Thread Bill Johnstone
Hello Jeff and Kiril, Thank you for your responses. Based on the information you both provided, I was able to get buildrpm to make the OMPI RPM the way I wanted. I ended up having to define _prefix, _mandir, and _infodir. Additionally, I found I had to use --define "shell_scripts_basename

[OMPI users] Making RPM from source that respects --prefix

2009-10-02 Thread Bill Johnstone
I'm trying to build an RPM of 1.3.3 from the SRPM. Despite typical RPM practice, I need to build ompi so that it installs to a different directory from /usr or /opt, i.e. what I would get if I just built from source myself with a --prefix argument to configure. When I invoke buildrpm with the

[OMPI users] mpirun (orte ?) not shutting down cleanly on job aborts

2008-06-09 Thread Bill Johnstone
Hello OMPI devs, I'm currently running OMPI v1.2.4. It didn't seem that any bugs which affect me or my users were fixed in 1.2.5 and 1.2.6, so I haven't upgraded yet. When I was initially getting started with Open MPI, I had some problems which I was able to solve, but one still remains. As

[OMPI users] Documentation on running under slurm

2008-06-05 Thread Bill Johnstone
Hello all. It would seem that the documentation, at least the FAQ page at http://www.open-mpi.org/faq/?category=slurm is a little out of date with respect to running on newer versions of SLURM (I just got things working with version 1.3.3). According to the SLURM documentation, srun -A is dep

[OMPI users] SLURM vs. Torque?

2007-10-22 Thread Bill Johnstone
Hello All. We are starting to need resource/scheduling management for our small cluster, and I was wondering if any of you could provide comments on what you think about Torque vs. SLURM? On the basis of the appearance of active development as well as the documentation, SLURM seems to be superior

Re: [OMPI users] mpirun hanging followup

2007-07-18 Thread Bill Johnstone
--- Ralph Castain wrote: > Unfortunately, we don't have more debug statements internal to that > function. I'll have to create a patch for you that will add some so > we can > better understand why it is failing - will try to send it to you on > Wed. Thank you for the patch you sent. I solved

Re: [OMPI users] mpirun hanging followup

2007-07-18 Thread Bill Johnstone
--- Ralph Castain wrote: > No, the session directory is created in the tmpdir - we don't create > anything anywhere else, nor do we write any executables anywhere. In the case where the TMPDIR env variable isn't specified, what is the default assumed by Open MPI/orte? > Just out of curiosity: a

Re: [OMPI users] mpirun hanging followup

2007-07-17 Thread Bill Johnstone
ble, or (to set it just for us) using -mca > tmpdir_base foo > on the mpirun command (or you can set OMPI_MCA_tmpdir_base=foo in > your > environment), where "foo" is the root of your tmp directory you want > us to > use (e.g., /tmp). > > Hope that helps > Ralph

Re: [OMPI users] mpirun hanging followup

2007-07-17 Thread Bill Johnstone
output so we can at least > > have a starting point. > >george. > > On Jul 17, 2007, at 2:46 PM, Bill Johnstone wrote: > > > George Bosilca wrote: > > > >> You can start by adding --debug-daemons and --debug to your mpirun > >> command lin

Re: [OMPI users] mpirun hanging followup

2007-07-17 Thread Bill Johnstone
George Bosilca wrote: > You can start by adding --debug-daemons and --debug to your mpirun > command line. This will generate a lot of output related to the > operations done internally by the launcher. If you send this output > to the list we might be able to help you a little bit more. OK, I ad

Re: [OMPI users] mpirun hanging followup

2007-07-17 Thread Bill Johnstone
Thanks for the help. I've replied below. --- "G.O." wrote: > 1- Check to make sure that there are no firewalls blocking > traffic between the nodes. There is no firewall in-between the nodes. If I run jobs directly via ssh, e.g. "ssh node4 env" they work. > 2 - Check to make sure tha

[OMPI users] mpirun hanging followup

2007-07-17 Thread Bill Johnstone
Hello all. I could really use help trying to figure out why mpirun is hanging as detailed in my previous message yesterday, 16 July. Since there's been no response, please allow me to give a short summary. -Open MPI 1.2.3 on GNU/Linux, 2.6.21 kernel, gcc 4.1.2, bash 3.2.15 is default shell -Open

[OMPI users] mpirun hangs on remote nodes -- how to find where and why?

2007-07-16 Thread Bill Johnstone
Hello. I'm trying to use Open MPI 1.2.3 on a cluster of dual-processor AMD64 nodes. These nodes are all connected via Gigabit Ethernet on a private, self-contained IP network. The OS is GNU/Linux, gcc 4.1.2, kernel 2.6.21. Open MPI was configured with --prefix=/usr/local and installed via make