Re: [O-MPI users] Minor issue: Failthrough of MCA components.

2005-11-21 Thread Jeff Squyres
Although George fixed the MX-abort error, let me clarify the rationale here... You are correct that at run-time, OMPI tries to load and run every component that it finds. So if you have BTL components built for all interconnects, OMPI will query each of them at run-time and try to use the
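
A minimal sketch of limiting that failthrough at run time (the BTL list and the executable name below are illustrative, not taken from the thread):

  mpirun --mca btl tcp,self -np 4 ./my_mpi_app

With an explicit btl list, only the named components are opened, so warnings from unusable interconnects do not appear.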

Re: [O-MPI users] Minor issue: Failthrough of MCA components.

2005-11-21 Thread Troy Telford
On Mon, 21 Nov 2005 06:00:05 -0700, Jeff Squyres wrote: Although George fixed the MX-abort error, let me clarify the rationale here... You are correct that at run-time, OMPI tries to load and run every component that it finds. So if you have BTL components built for all interconnects, OMPI w

Re: [O-MPI users] Minor issue: Failthrough of MCA components.

2005-11-21 Thread Brian Barrett
On Nov 21, 2005, at 10:34 AM, Troy Telford wrote: On Mon, 21 Nov 2005 06:00:05 -0700, Jeff Squyres wrote: These warnings will likely be removed (or, more specifically, only displayed if requested) once we include the feature to display which BTL components/networks are being used at r
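
Until that display feature exists, one way to see which BTL components are actually being selected is to raise the BTL framework's verbosity; a sketch, assuming the btl_base_verbose MCA parameter and a hypothetical executable name:

  mpirun --mca btl_base_verbose 30 -np 2 ./my_mpi_app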

[O-MPI users] problem with overflow 1.8ab code using GM

2005-11-21 Thread Borenstein, Bernard S
Things have improved a lot since I ran the code using the earlier betas, but it still fails near the end of the run. The error messages are: FOR GRID 4 AT STEP 466 L2NORM = 0.74601987E-09 FOR GRID 5 AT STEP 466 L2NORM = 0.86085437E-08 FOR GRID 6 AT

[O-MPI users] another overflow 1.8ab problem

2005-11-21 Thread Borenstein, Bernard S
Just tried to run a very large case on another cluster with TCP. It cranks away for quite a while, then I get this message: FOR GRID 78 AT STEP 733 L2NORM = 0.30385345E-03 FOR GRID 79 AT STEP 733 L2NORM = 0.26182533E+00 [hsd660:02490] spawn: in job_state_callback(jobi

Re: [O-MPI users] problem with overflow 1.8ab code using GM

2005-11-21 Thread Galen M. Shipman
Bernard, This code is using MPI_Alloc_mem, which is good. Do you have an idea of approx. how much memory has been allocated via MPI_Alloc_mem at the time of failure? Thanks, Galen On Mon, 21 Nov 2005, Borenstein, Bernard S wrote: Things have improved a lot since I ran the code using the e
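
For context, a minimal C sketch of the MPI_Alloc_mem pattern in question, with an illustrative running byte counter (the allocation size and the counter are assumptions, not part of the OVERFLOW code):

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      double *buf;
      MPI_Aint nbytes = 1024 * 1024 * sizeof(double);  /* illustrative size */
      MPI_Aint total  = 0;                             /* running total of MPI_Alloc_mem bytes */

      MPI_Init(&argc, &argv);

      /* Ask MPI for memory it can register with the interconnect (e.g. GM) */
      MPI_Alloc_mem(nbytes, MPI_INFO_NULL, &buf);
      total += nbytes;
      printf("MPI_Alloc_mem total so far: %ld bytes\n", (long)total);

      /* ... communication using buf ... */

      MPI_Free_mem(buf);
      MPI_Finalize();
      return 0;
  }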

Re: [O-MPI users] Configuring port

2005-11-21 Thread Enrique Curchitser
OK, I'll take you up on the offer. I have 4 Power Mac G5's on a private network connected through a GigE switch. Even for large problems the communications are sluggish. This same code has been shown to scale to upwards of 128 processors on IBM SPs. So here is the output of ompi_info --param bt
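
A sketch of the usual first diagnostics for sluggish GigE (the TCP BTL parameter and the en0 interface name are assumptions about this Power Mac setup, not from the post):

  ompi_info --param btl tcp
  mpirun --mca btl tcp,self --mca btl_tcp_if_include en0 -np 4 ./my_app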