Re: [OMPI users] openMPI on Xgrid

2010-03-29 Thread Klymak Jody
I have an environment a few trusted users could use to test. However, I have neither the expertise nor the time to do the debugging myself. Cheers, Jody On 2010-03-29, at 1:27 PM, Jeff Squyres wrote: On Mar 29, 2010, at 4:11 PM, Cristobal Navarro wrote: i realized that xcode dev tools inclu

Re: [OMPI users] openmpi with xgrid

2009-08-15 Thread Klymak Jody
On 15-Aug-09, at 1:03 AM, Alan wrote: Thanks Warner, This is frustrating... I read the ticket. 6 months already and 2 releases postponed... Frankly, I am very skeptical that this will be fixed for 1.3.4. I really hope so, but when will 1.3.4 be released? I have to think about going with

Re: [OMPI users] torque pbs behaviour...

2009-08-11 Thread Klymak Jody
On 11-Aug-09, at 6:16 AM, Jeff Squyres wrote: This means that OMPI is finding an mca_iof_proxy.la file at run time from a prior version of Open MPI. You might want to use "find" or "locate" to search your nodes and find it. I suspect that you somehow have an OMPI 1.3.x install that overl

Re: [OMPI users] torque pbs behaviour...

2009-08-11 Thread Klymak Jody
ble is dynamically linking to? Can I rebuild openmpi statically? Thanks, Jody On Tue, Aug 11, 2009 at 7:43 AM, Klymak Jody wrote: On 11-Aug-09, at 6:28 AM, Ralph Castain wrote: The reason your job is hanging is sitting in the orte-ps output. You have multiple processes declaring themselve

Re: [OMPI users] torque pbs behaviour...

2009-08-11 Thread Klymak Jody
On 11-Aug-09, at 6:28 AM, Ralph Castain wrote: -mca plm_base_verbose 5 --debug-daemons -mca odls_base_verbose 5 I'm afraid the output will be a tad verbose, but I would appreciate seeing it. Might also tell us something about the lib issue. Command line was: /usr/local/openmpi/bin/mpirun

Re: [OMPI users] torque pbs behaviour...

2009-08-11 Thread Klymak Jody
On 11-Aug-09, at 6:28 AM, Ralph Castain wrote: The reason your job is hanging is sitting in the orte-ps output. You have multiple processes declaring themselves to be the same MPI rank. That definitely won't work. It's the "local rank" if that makes any difference... Any thoughts on this o

Re: [OMPI users] torque pbs behaviour...

2009-08-11 Thread Klymak Jody
On 10-Aug-09, at 8:03 PM, Ralph Castain wrote: Interesting! Well, I always make sure I have my personal OMPI build before any system stuff, and I work exclusively on Mac OS-X: I am still finding this very mysterious. I have removed all the OS-X-supplied libraries, recompiled and insta

Re: [OMPI users] torque pbs behaviour...

2009-08-10 Thread Klymak Jody
dylib (compatibility version 1.0.0, current version 1.0.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 111.1.3) Thanks, Jody On Aug 10, 2009, at 8:53 PM, Klymak Jody wrote: On 10-Aug-09, at 6:44 PM, Ralph Castain wrote: Check your LD_LIBRARY_PATH - there i

Re: [OMPI users] torque pbs behaviour...

2009-08-10 Thread Klymak Jody
$DYLD_LIBRARY_PATH $LD_LIBRARY_PATH /usr/local/openmpi/lib: /usr/local/openmpi/lib: So I'm afraid I'm stumped again. I suppose I could go clean out all the libraries in /usr/lib/... Thanks again, sorry to be a pain... Cheers, Jody On Aug 10, 2009, at 7:38 PM, Klymak Jody

Re: [OMPI users] torque pbs behaviour...

2009-08-10 Thread Klymak Jody
So, mpirun --display-allocation -pernode --display-map hostname gives me the output below. Simple jobs seem to run, but the MITgcm does not, either under ssh or torque. It hangs at some early point in execution before anything is written, so it's hard for me to tell what the error is. Co

Re: [OMPI users] 2 to 1 oversubscription

2009-07-15 Thread Klymak Jody
Hi Robert, Sorry if this is offtopic for the more knowledgeable here... On 14-Jul-09, at 7:50 PM, Robert Kubrick wrote: By setting processor affinity you can force execution of each process on a specific core, thus limiting context switching. I know affinity wasn't supported on MacOS last ye

Re: [OMPI users] 2 to 1 oversubscription

2009-07-14 Thread Klymak Jody
On 14-Jul-09, at 5:14 PM, Robert Kubrick wrote: Jody, Just to make sure, you did set processor affinity during your test right? I'm not sure what that means in the context of OS X. Hyperthreading was turned on. Cheers, Jody On Jul 13, 2009, at 9:28 PM, Klymak Jody wrote: Hi R

Re: [OMPI users] 2 to 1 oversubscription

2009-07-13 Thread Klymak Jody
Hi Robert, I got inspired by your question to run a few more tests. They are crude, and I don't have actual cpu timing information because of a library mismatch. However: Setup: Xserve, 2x2.26 GHz Quad-core Intel Xeon 6.0 Gb memory 1067 MHz DDR3 Mac OS X 10.5.6 Nodes are connected with a

Re: [OMPI users] Xgrid and choosing agents...

2009-07-12 Thread Klymak Jody
Hi Ralph, On 12-Jul-09, at 4:07 AM, Ralph Castain wrote: Assuming that Scoreboard is appropriately licensed (i.e., is not licensed under GPL, but preferably something like FreeBSD), and that it has an accessible API, then we can link against it when in that environment and interact any way

Re: [OMPI users] Xgrid and choosing agents...

2009-07-12 Thread Klymak Jody
chine and compute a "score". Nodes with the highest score get the job. However, how one would implement that using openMPI is unclear to me. Does openMPI have the capability of passing arbitrary arguments to the resource managers? Thanks, Jody Regards. Vitorio. Le 09-07

Re: [OMPI users] 2 to 1 oversubscription

2009-07-11 Thread Klymak Jody
Hi Robert, I ran some very crude tests and found that things slowed down once you got over 8 cores at a time. However, they didn't slow down by 50% if you went to 16 processes. Sadly, the tests were so crude, I did not keep good notes (it appears). I'm running a gcm, so my benchmarks ma

Re: [OMPI users] Xgrid and choosing agents...

2009-07-11 Thread Klymak Jody
sure you'll get the rank layout you'll want, though...or if that is important to what you are doing. Ralph On Jul 11, 2009, at 1:18 PM, Klymak Jody wrote: Hi Vitorio, Thanks for getting back to me! My hostfile is xserve01.local max-slots=8 xserve02.local max-slots=8 xserve03.local m

Re: [OMPI users] Xgrid and choosing agents...

2009-07-11 Thread Klymak Jody
d we absolutely # want to disallow over-subscribing it: yow.example.com slots=4 max-slots=4 so in your case like mine you should have something like: your.hostname.domain slots=8 max-slots=8 # for each node I hope this will help you. Regards. Vitorio. Le 09-07-11 à 10:56, Klymak Jody a écrit : Hi

[OMPI users] Xgrid and choosing agents...

2009-07-11 Thread Klymak Jody
Hi all, Sorry in advance if these are naive questions - I'm not experienced in running a grid... I'm using openMPI on 4 dual quad-core Xeon Xserves. The 8 cores mimic 16 cores and show up in xgrid as each agent having 16 processors. However, the processing speed goes down as the used pr