Re: [OMPI users] segmentation faults

2007-08-14 Thread Adams, Samuel D Contr AFRL/HEDR
ble to run this through valgrind or some other memory-checking debugger? It looks like the single process case may be the simplest to check...? On Aug 13, 2007, at 5:03 PM, Adams, Samuel D Contr AFRL/HEDR wrote: > I tried to run a code that I have had running for a while now this > mo
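One way to do that check (the program name below is just a placeholder) is to run the single-process case under valgrind through mpirun:

$ mpirun -np 1 valgrind --leak-check=full ./my_mpi_program

Any invalid reads or writes it reports usually point straight at the offending source line.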

[OMPI users] segmentation faults

2007-08-13 Thread Adams, Samuel D Contr AFRL/HEDR
I tried to run a code that I have had running for a while now this morning, but for some reason it is causing segmentation faults. I can't really think of anything I have done recently that would cause these errors. Does anyone have any idea? I get this running it on more than one processo

Re: [OMPI users] torque and openmpi

2007-08-01 Thread Adams, Samuel D Contr AFRL/HEDR
I reran the configure script with the --with-tm flag this time. Thanks for the info. It was working before for clients with ssh properly configured (i.e., my account only), but now it works for all accounts (i.e., biologist and physicist users) without having to use ssh. Sam Adams General
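For reference, a rebuild with Torque/TM support looks roughly like this; the Torque prefix and install prefix below are examples, not the paths used here:

$ ./configure --with-tm=/usr/local --prefix=/opt/openmpi
$ make all install

With TM support compiled in, mpirun launches its daemons through pbs_mom, so per-user ssh keys are no longer needed.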

Re: [OMPI users] torque and openmpi

2007-07-27 Thread Adams, Samuel D Contr AFRL/HEDR
sh to launch remote processes; we use the internal TM API in Torque). On Jul 27, 2007, at 11:38 AM, Adams, Samuel D Contr AFRL/HEDR wrote: > I deleted all of the entries out of the known_hosts file, but that > didn't seem to help. I can run jobs just fine without torque on multi

Re: [OMPI users] torque and openmpi

2007-07-27 Thread Adams, Samuel D Contr AFRL/HEDR
f the connection failed because of a wrong ssh key. Clean your .ssh/known_hosts and the problem will vanish. Thanks, george. On Jul 27, 2007, at 11:01 AM, Adams, Samuel D Contr AFRL/HEDR wrote: > When I run jobs with torque, I get this error message. Any ideas? > > [sam@prodn
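If only one host's key changed, the single stale entry can be removed instead of wiping the whole file; the hostname below is just an example:

$ ssh-keygen -R prodnode3.brooks.af.mil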

[OMPI users] torque and openmpi

2007-07-27 Thread Adams, Samuel D Contr AFRL/HEDR
When I run jobs with torque, I get this error message. Any ideas? [sam@prodnode1 all]$ cat script.sh.err Host key verification failed. [prodnode3.brooks.af.mil:03321] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275 [prodnode3.brooks.af.mil:03321] [0,0,0] ORTE_ERROR

[OMPI users] nfs romio

2007-07-02 Thread Adams, Samuel D Contr AFRL/HEDR
I wrote a code the other day that uses MPI IO to write files, and OpenMPI/ROMIO seems to be having problems with the NFS server. I have read that NFSv4 and ROMIO don't perform very well together, but my problem is probably a misconfiguration of some kind. Basically, when I use MPI IO to write a file on an
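For anyone reproducing this, a minimal MPI-IO writer along these lines exercises the same ROMIO-over-NFS path; the filename and data are placeholders, not the original code:

#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* collective open on a file that lives on the NFS mount */
    MPI_File_open(MPI_COMM_WORLD, "out.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* each rank writes one int at its own offset */
    MPI_File_write_at(fh, (MPI_Offset)(rank * sizeof(int)), &rank, 1,
                      MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}

ROMIO's documentation generally recommends mounting NFS with attribute caching disabled (noac / actimeo=0) so that concurrent writes from multiple nodes are not corrupted.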

[OMPI users] initial setup

2007-06-27 Thread Adams, Samuel D Contr AFRL/HEDR
Something is wrong with getting my LD_LIBRARY_PATH variable set when I do a non-interactive login to the remote nodes. These nodes are running RHEL5. It looks something like this: [sam@prodnode1 fdtd_0.3]$ time mp
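One workaround that avoids touching the shell startup files at all is to let mpirun set the remote environment itself via --prefix, which propagates PATH and LD_LIBRARY_PATH to the remote daemons; the install path and program name below are only examples:

$ mpirun --prefix /opt/openmpi -np 4 ./my_program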

[OMPI users] ethernet bonding

2007-05-24 Thread Adams, Samuel D Contr AFRL/HEDR
We recently got 33 new cluster nodes, all of which have two onboard GigE NICs. We also got two PowerConnect 2748 48-port switches, which support IEEE 802.3ad (link aggregation). I have configured the nodes to do Ethernet bonding to aggregate the two NICs into one bonded device: http://www.cyberci
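For reference, the RHEL-style configuration behind a setup like this looks roughly as follows; device names, addresses, and options are illustrative, not the exact files in use here (on older releases the mode/miimon options go in /etc/modprobe.conf instead):

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.1.10
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=802.3ad miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth0  (ifcfg-eth1 is analogous)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none

The switch ports facing each node also have to be placed in a matching LACP group for 802.3ad mode to negotiate.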

Re: [OMPI users] job running question

2006-04-12 Thread Adams Samuel D Contr AFRL/HEDR
ding on whether you log in interactively or not. Also, I don't think ~/.bashrc is read on a non-interactive login. Michael On Apr 10, 2006, at 1:06 PM, Adams Samuel D Contr AFRL/HEDR wrote: > I put it in /etc/bashrc and opened a new shell, but I still am not > seeing any core files. >

Re: [OMPI users] job running question

2006-04-10 Thread Adams Samuel D Contr AFRL/HEDR
n file (~/.bashrc in the BASH case and ~/.tcshrc in the TCSH/CSH case) you will find your core files :) Regards, Pavel Shamis (Pasha) Adams Samuel D Contr AFRL/HEDR wrote: > I set bash to have unlimited size core files like this: > > $ ulimit -c unlimited > > But, it was not dropping co
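If the limit needs to apply to shells that never read an rc file (e.g. processes started non-interactively), pam_limits is one place to set it system-wide; a minimal sketch for /etc/security/limits.conf:

# allow unlimited-size core files for all users
*    soft    core    unlimited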

Re: [OMPI users] job running question

2006-04-10 Thread Adams Samuel D Contr AFRL/HEDR
you get a corefile; can you send the backtrace from that corefile? > -Original Message- > From: users-boun...@open-mpi.org > [mailto:users-boun...@open-mpi.org] On Behalf Of Adams Samuel > D Contr AFRL/HEDR > Sent: Friday, April 07, 2006 11:53 AM > To: 'us...@open-mpi
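For completeness, pulling the backtrace out of a core file is just (binary and core file names are placeholders):

$ gdb ./my_program core.12345
(gdb) bt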

[OMPI users] job running question

2006-04-07 Thread Adams Samuel D Contr AFRL/HEDR
We are trying to build a new cluster running OpenMPI. We were previously running LAM-MPI. To run jobs we would do the following: $ lamboot lam-host-file $ mpirun C program I am not sure if this works more or less the same way with ompi. We were trying to run it like this: $ [james.parker@Cent01
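Under Open MPI there is no lamboot step; the host list goes straight to mpirun. A rough equivalent of the LAM workflow above, with placeholder node names and process count:

$ cat hostfile
node01 slots=2
node02 slots=2
$ mpirun --hostfile hostfile -np 4 ./my_program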

[OMPI users] OMPI 1.0.1, CentOS 4.2 and gcc4

2006-03-29 Thread Adams Samuel D Contr AFRL/HEDR
It seems like this should be a simple problem. I am trying to get OpenMPI to compile on a CentOS 4.2 (like Redhat EL 4.2) box. It has gcc 3.4 and gcc 4.0 installed. I want to compile OMPI with gcc4, but I am getting this error. What am I doing wrong? [root@Cent01 openmpi-1.0.1]# CC=gcc4 CP
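For reference, selecting the alternate compilers is normally done by handing all of the compiler variables to configure at once; the g++4/gfortran4 wrapper names below are an assumption about how CentOS packages the secondary compiler, so substitute whatever binaries are actually installed:

[root@Cent01 openmpi-1.0.1]# CC=gcc4 CXX=g++4 F77=gfortran4 FC=gfortran4 ./configure --prefix=/opt/openmpi
[root@Cent01 openmpi-1.0.1]# make all install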

Re: [OMPI users] Building OpenMPI with Lahey Fortran 95

2006-03-03 Thread Adams Samuel D Contr AFRL/HEDR
l Dynamics - Network Systems Phone: 210.536.5945 -Original Message- From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Adams Samuel D Contr AFRL/HEDR Sent: Thursday, March 02, 2006 10:08 AM To: 'us...@open-mpi.org' Subject: [OMPI users] Building OpenM

[OMPI users] Building OpenMPI with Lahey Fortran 95

2006-03-02 Thread Adams Samuel D Contr AFRL/HEDR
I am trying to build OpenMPI using Lahey Fortran 95 6.2 on a Fedora Core 3 box. I can run the configure script OK, but the problem occurs when I run make. It appears to be bombing out when building the Fortran libraries. It seems to me that OpenMPI is naming its modules with .ompi_mod in
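For anyone following along, pointing the build at the Lahey driver is done through the configure variables; lf95 is the usual Lahey/Fujitsu compiler name, but treat the exact paths and prefix as assumptions:

$ ./configure F77=lf95 FC=lf95 --prefix=/opt/openmpi
$ make all install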