Hi Rob,
thanks for your comments. I understand that it's most probably not worth
the effort to find the actual reason.
Because I have to deal with very large files I preferred using
"std::numeric_limits::max()" rather than a hard-coded value
to split the read in case an IO request exceeds this am
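
The splitting idea mentioned in this message could look roughly like the sketch below. It is an illustration only, not the poster's code: the helper name `split_request` is hypothetical, and it assumes the I/O call takes its count as an `int` (as MPI-IO routines such as `MPI_File_read` do), so the limit is taken from `std::numeric_limits<int>::max()`.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

// ASSUMPTION: a single I/O request carries its count as an int, so the
// largest request is std::numeric_limits<int>::max() bytes.
constexpr std::uint64_t kMaxRequest =
    static_cast<std::uint64_t>(std::numeric_limits<int>::max());

// Hypothetical helper: split `total` bytes into request sizes, none of
// which exceeds `max_request`. The sizes sum to `total`.
std::vector<std::uint64_t> split_request(std::uint64_t total,
                                         std::uint64_t max_request = kMaxRequest) {
    std::vector<std::uint64_t> chunks;
    if (max_request == 0) {
        return chunks;  // no valid request size, so return nothing
    }
    while (total > 0) {
        const std::uint64_t n = std::min(total, max_request);
        chunks.push_back(n);  // one I/O request of n bytes
        total -= n;
    }
    return chunks;
}
```

Each returned size would then be passed to one read call, with the file offset advanced by that size between calls. Taking the limit from `std::numeric_limits` instead of a hard-coded constant keeps it tied to the actual width of the count type.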
Hello Reuti,
> Defining 12 slots and requesting the machines exclusively is not an option?
I would like to. Unfortunately, the system has been in production for 2 years
now, and many scripts depend on this setup.
>
> The only way to get it working otherwise is to unset $JOB_ID and so
> on, so that Open MPI
Agreed that the original program had the char*[20]/char[20] bug, but his segv
is occurring before trying to use that array. So it's a bug - but he just
hadn't hit it yet. :-)
I'd still like to see a debugging version so that we can get a real stack
trace, and/or try the latest 1.4.4 RC (poste
Hello All,
I have just rebuilt openmpi-1.4-3 on our cluster, and I see this error:
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environme
I am getting some undefined references in building OpenMPI 1.5.4 and I would
like to know how to work around it.
The errors look like this:
/scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o):
In function `hwloc_linux_alloc_membind':
topology-linux.c:(.text+0x1da): und
Yowza; that sounds like a configury bug. :-(
What line were you using to configure Open MPI? Do you have libnuma installed?
If so, do you have the .h and .so files? Do you have the .a file?
Can you send the last few lines of output from a failed "make V=1" in that
tree? (it'll show us the
Hi,
Interestingly, the errors are gone after I removed "-g" from the app
compile options.
I tested again on the fresh Ubuntu 11.10 install: both 1.4.3 and 1.5.4
compile fine, but with the same error.
Also I tried hard to find any 32-bit object or library and failed.
They all are 64-bit.
- D.
On 28/09/2011 17:55, Blosch, Edwin L wrote:
> I am getting some undefined references in building OpenMPI 1.5.4 and I
> would like to know how to work around it.
>
> The errors look like this:
>
> /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o):
> In fu
On 09/27/2011 05:30 PM, Jeff Squyres wrote:
> On Sep 27, 2011, at 5:03 PM, Prentice Bisbal wrote:
>
>> To clarify, is IP/Ethernet required, or will IPoIB be used if it's
>> configured on the nodes? Would this make a difference.
>
> IPoIB is fine, although I've heard concerns about its stability a
On 28.09.2011 at 18:09, Brice Goglin wrote:
On 28/09/2011 17:55, Blosch, Edwin L wrote:
I am getting some undefined references in building OpenMPI 1.5.4
and I would like to know how to work around it.
The errors look like this:
/scratch1/bloscel/builds/release/openmpi-intel/lib/
I am wondering what the proper way is to stop an mpirun process and the child
processes it created. I tried sending SIGTERM, but it does not respond to it.
What kind of signal should I be sending to it?
Thanks
Xin
Jeff,
I've now tried adding --without-libnuma. Actually, that did NOT fix the
problem, so I can send you the full output from configure if you want, to
understand why this "hwloc" function is trying to use a function which appears
to be unavailable. The answers to some of your questions:
I tried 1.4.4rc4, same problem. Where do I get a debugging version?
On 9/28/11 8:32 AM, Jeff Squyres wrote:
Agreed that the original program had the char*[20]/char[20] bug, but his segv
is occurring before trying to use that array. So it's a bug - but he just
hadn't hit it yet. :-)
I'd stil
Use --enable-debug on your configure line. This will add in some debugging
code to OMPI, and it'll compile everything with -g so that you can get stack
traces.
Beware that the extra debugging junk makes OMPI slightly slower; don't do any
benchmarking with this install, etc.
On Sep 28, 2011,
14 matches