if( (status = mx_get_info( mx_btl->mx_endpoint, MX_LINE_SPEED,
                           &nic_id, sizeof(nic_id),
                           &value, sizeof(int))) != MX_SUCCESS )
{
Yes, a NIC ID is required for this call because a host may have multiple NICs with
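For context, here is a minimal self-contained sketch of that query, assuming the standard MX C API from myriexpress.h (mx_get_info() with the MX_LINE_SPEED key and the NIC ID passed as the input value); the helper name query_line_speed and the exact in/out value types are illustrative assumptions, not code from the patch above:

#include <stdio.h>
#include <inttypes.h>
#include <myriexpress.h>   /* MX API: mx_get_info(), MX_LINE_SPEED, mx_strerror() */

/* Illustrative helper: ask for the link speed of one specific NIC through an
 * already-opened endpoint.  The NIC ID goes in as the "in" value because a
 * host may have several NICs, each with its own link speed. */
static int query_line_speed(mx_endpoint_t endpoint, uint64_t nic_id)
{
    uint32_t speed = 0;
    mx_return_t status = mx_get_info(endpoint, MX_LINE_SPEED,
                                     &nic_id, sizeof(nic_id),
                                     &speed, sizeof(speed));
    if (status != MX_SUCCESS) {
        fprintf(stderr, "mx_get_info(MX_LINE_SPEED) failed: %s\n",
                mx_strerror(status));
        return -1;
    }
    printf("NIC 0x%016" PRIx64 ": line speed %" PRIu32 "\n", nic_id, speed);
    return 0;
}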
Just to brainstorm on this a little - the two different clusters will have
different "mapper IDs", and this can be learned via the attached code
snippet. As long as fma is the mapper (as opposed to the older, deprecated
"gm_mapper" or "mx_mapper"), then Myrinet topology rules ensure that NIC 0
What version of GM are you running?
# rpm -qa |egrep "^gm-[0-9]+|^gm-devel"
gm-2.0.24-1
gm-devel-2.0.24-1
Is this too old?
Nope, that's just fine.
A mismatch between the list of nodes actually configured onto the Myrinet
fabric and the machine file is a common source of errors like this.
I'm having difficulty running a simple hello world OpenMPI program over the
Myrinet gm interconnect - please see the log at the end of this email. The
error is tripped by a call to the function

gm_global_id_to_node_id(
    gm_btl->port,
    gm_endpoint->endpoint_addr.global_id,
    &gm_e
[repost - apologies, apparently my first one was unintentionally a
followup to another thread]
If you ever do an opal_output() with a "%p" in the format string,
guess_strlen() can segfault because it neglects to consume the corresponding
argument, causing subsequent "%s" in the same format string to blow up in
strlen() on a bad address. Any objections to the following patch to add %p
support?
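To illustrate the failure mode (a generic sketch, not the actual opal_output()/guess_strlen() patch): a length-guessing routine that walks the format string must pull the %p argument off the va_list, otherwise a later %s fetches the wrong pointer and strlen() dereferences garbage.

#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for guess_strlen(): estimate the formatted length by
 * walking the format string and consuming the matching varargs.  The '%p'
 * case is the point of the patch: without the va_arg() there, the va_list
 * falls out of sync and the next '%s' hands a bogus pointer to strlen(). */
static size_t guess_len(const char *fmt, va_list ap)
{
    size_t len = 0;
    const char *p = fmt;

    while (*p != '\0') {
        if (*p != '%') { ++len; ++p; continue; }
        ++p;                                  /* character after '%' */
        switch (*p) {
        case 's':
            len += strlen(va_arg(ap, char *));
            break;
        case 'd':
            (void)va_arg(ap, int);
            len += 11;                        /* worst-case 32-bit decimal */
            break;
        case 'p':
            (void)va_arg(ap, void *);         /* the fix: consume the pointer */
            len += 2 + 2 * sizeof(void *);    /* "0x" plus hex digits */
            break;
        case '\0':
            return len;                       /* stray trailing '%' */
        default:
            ++len;
            break;
        }
        ++p;
    }
    return len;
}

/* Without the '%p' branch above consuming its argument, the "%s" below would
 * call strlen() on whatever happens to follow the pointer on the va_list. */
static size_t demo(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    size_t n = guess_len(fmt, ap);
    va_end(ap);
    return n;
}

int main(void)
{
    int id = 42;
    printf("guessed length: %zu\n",
           demo("ptr=%p msg=%s id=%d", (void *)&id, "hello", id));
    return 0;
}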
Right, that's the maximum number of open MX channels, i.e. processes that
can run on the node using MX. With MX (1.2.0c I think), I get weird messages
if I run a second mpirun quickly after the first one failed. The Myrinet
guys, I'm quite sure, can explain why and how. Somehow, when an application
Second thing. From one of your previous emails, I see that MX is configured
with 4 instances per node. You're running with exactly 4 processes on the
first 2 nodes. Weird things might happen ...
4 processes per node will be just fine. This is not like GM where the 4
includes some "reserved" port
$ mpirun --prefix /usr/local/openmpi-1.2b2 --hostfile ./h1-3 -np 1 --mca
btl mx,sm,self ./cpi
[node-1:09704] mca: base: component_find: unable to open mtl mx: file
not found (ignored)
[node-1:09704] mca: base: component_find: unable to open btl mx: file
not found (ignored)
This in particular is
As for the MTL, there is a bug in the MX
MTL for v1.2 that has been fixed, but after 1.2b2 ...
Oops, I was stupidly assuming he already had that fix. Yes, this is an
important fix...
-reese
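A quick local check for messages like "unable to open btl mx: file not found (ignored)" is to confirm that the MX components were built into this install and that their MX library dependency resolves; the component path below is an assumption based on the --prefix used above:

$ ompi_info | grep -i mx
$ ldd /usr/local/openmpi-1.2b2/lib/openmpi/mca_btl_mx.so   # Linux; use "otool -L" on OS X

If the MX components are missing from the ompi_info output, or ldd reports the MX library as "not found", that would explain the "(ignored)" messages above.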
Ompi failing on mx only
> I've attached the ompi_info from node-1 and node-2.
Thanks, but I need "mx_info", not "ompi_info" ;-)
But now that you mention mapper, I take it that's what SEGV_MAPERR might
be referring to.
This is an OMPI red herring; it has nothing to do with Myrinet mapping, even
Ompi failing on mx only
Hi, Gary-
This looks like a config problem, and not a code problem yet. Could you send
the output of mx_info from node-1 and from node-2? Also, forgive me
counter-asking a possibly dumb OMPI question, but is "-x LD_LIBRARY_PATH"
really what you want, as opposed to "-x LD
Well, I have had no luck finding a way to up the amount the system will
allow GM to use. What is a recommended solution? Is this even a problem in
most cases? Like, am I encountering a corner case?
Upping the limit was not what I'm suggesting as a fix, just pointing out
that it is kind of low and
GM: gm_register_memory will be able to lock XXX pages (YYY MBytes)
Is there a way to tell GM to pull more memory from the system?
GM reserves all IOMMU space that the OS is willing to give it, so what is
needed is a way to tell the OS and/or machine to allow a bigger chunk of
IOMMU space to b
Also, I have no idea what the memory window question is; I will look it up
on Google.
aon075:~ root# dmesg | grep GM
GM: gm_register_memory will be able to lock 96000 pages (375 MBytes)
This just answered it - there is 375MB available for GM to register, which
is the IOMMU window size available
I have tried moving the run around to different machines, with the same
result in multiple places.
The error is:
[aon049.engin.umich.edu:21866] [mpool_gm_module.c:100] error(8)
registering gm memory
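For illustration, here is a minimal sketch of the kind of registration call that fails here, assuming the GM 2.x C API (gm_register_memory() on an already-opened gm_port); the helper name pin_buffer and the error handling are mine, not the OMPI mpool code:

#include <gm.h>   /* GM API: gm_register_memory(), gm_perror() */

/* Illustrative helper: pin 'len' bytes of 'buf' for DMA.  Once the total
 * pinned memory approaches the IOMMU window reported by dmesg (the 375
 * MBytes above), registration starts failing with errors like the error(8)
 * in the log. */
static int pin_buffer(struct gm_port *port, void *buf, unsigned long len)
{
    gm_status_t rc = gm_register_memory(port, buf, len);
    if (rc != GM_SUCCESS) {
        gm_perror("gm_register_memory failed", rc);   /* prints GM's error string */
        return -1;
    }
    return 0;
}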
This is on a PPC-based OSX system? How many MPI processes per node are you
starting? And I as
This is due to a problem in the (void *) -> (uint64_t) conversion in OMPI.
The following patch fixes the problem, as would an appropriate cast of pval,
I suspect. The problem is an inappropriate use of ompi_ptr_t. I would guess
that other uses of lval might be suspect also (such as in the Portals c
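To illustrate the class of bug being described (a generic sketch, not the actual OMPI patch; the real ompi_ptr_t layout in the tree may differ): on a 32-bit big-endian machine such as PPC OS X, storing a pointer into the pointer member of a pointer/uint64_t union and then reading back the 64-bit member does not round-trip.

#include <stdint.h>
#include <stdio.h>

/* Generic stand-in for the ompi_ptr_t idiom under discussion. */
typedef union {
    uint64_t lval;   /* 64-bit integer view */
    void    *pval;   /* native pointer view; only 4 bytes on 32-bit PPC */
} ptr_union_t;

int main(void)
{
    int x = 0;
    ptr_union_t u;

    u.lval = 0;      /* without this, the bytes pval does not cover are garbage */
    u.pval = &x;

    /* On a 32-bit big-endian platform, pval overlays the HIGH-order bytes of
     * lval, so u.lval ends up as the pointer value shifted left by 32 bits:
     * a bad 64-bit address of exactly the kind that breaks the
     * (void *) -> (uint64_t) conversion.  Casting through uintptr_t (or using
     * pval directly) avoids the problem. */
    printf("via union lval: 0x%016llx\n", (unsigned long long)u.lval);
    printf("via cast:       0x%016llx\n", (unsigned long long)(uintptr_t)&x);
    return 0;
}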