Hello,

I am using Sun HPC Toolkit 7.0 to compile and run my C MPI programs. I have tested the Myrinet installation using Myricom's own test programs. The Myricom software stack I am using is MX, version mx2g-1.1.7, and mx_mapper is also in use.

We have 4 nodes (Sun Fire V890), each with 8 dual-core processors, and the operating system is Solaris 10 (SunOS indus1 5.10 Generic_125100-10 sun4u sparc SUNW,Sun-Fire-V890).

The contents of the machine file are:
indus1
indus2
indus3
indus4
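(For reference, I compile the test program with the mpicc wrapper from the toolkit installation under /opt/SUNWhpc/HPC7.0, roughly *mpicc -o hello hello.c*, and launch it with the mpirun command shown further below.)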
The output of *mx_info* on each node is given below:
======
*indus1*
======
MX Version: 1.1.7rc3cvs1_1_fixes
MX Build: @indus4:/opt/mx2g-1.1.7rc3 Thu May 31 11:36:59 PKT 2007
2 Myrinet boards installed.
The MX driver is configured to support up to 4 instances and 1024 nodes.
===================================================================
Instance #0: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
Status: Running, P0: Link up
MAC Address: 00:60:dd:47:ad:7c
Product code: M3F-PCIXF-2
Part number: 09-03392
Serial number: 297218
Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
Mapped hosts: 10
ROUTE COUNT
INDEX MAC ADDRESS HOST NAME P0
----- ----------- --------- ---
0) 00:60:dd:47:ad:7c indus1:0 1,1
2) 00:60:dd:47:ad:68 indus4:0 8,3
3) 00:60:dd:47:b3:e8 indus4:1 7,3
4) 00:60:dd:47:b3:ab indus2:0 7,3
5) 00:60:dd:47:ad:66 indus3:0 8,3
6) 00:60:dd:47:ad:76 indus3:1 8,3
7) 00:60:dd:47:ad:77 jhelum1:0 8,3
8) 00:60:dd:47:b3:5a ravi2:0 8,3
9) 00:60:dd:47:ad:5f ravi2:1 1,1
10) 00:60:dd:47:b3:bf ravi1:0 8,3
===================================================================
======
*indus2*
======
MX Version: 1.1.7rc3cvs1_1_fixes
MX Build: @indus2:/opt/mx2g-1.1.7rc3 Thu May 31 11:24:03 PKT 2007
2 Myrinet boards installed.
The MX driver is configured to support up to 4 instances and 1024 nodes.
===================================================================
Instance #0: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
Status: Running, P0: Link up
MAC Address: 00:60:dd:47:b3:ab
Product code: M3F-PCIXF-2
Part number: 09-03392
Serial number: 296636
Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
Mapped hosts: 10
ROUTE COUNT
INDEX MAC ADDRESS HOST NAME P0
----- ----------- --------- ---
0) 00:60:dd:47:b3:ab indus2:0 1,1
2) 00:60:dd:47:ad:68 indus4:0 1,1
3) 00:60:dd:47:b3:e8 indus4:1 8,3
4) 00:60:dd:47:ad:66 indus3:0 1,1
5) 00:60:dd:47:ad:76 indus3:1 7,3
6) 00:60:dd:47:ad:77 jhelum1:0 7,3
8) 00:60:dd:47:ad:7c indus1:0 8,3
9) 00:60:dd:47:b3:5a ravi2:0 8,3
10) 00:60:dd:47:ad:5f ravi2:1 8,3
11) 00:60:dd:47:b3:bf ravi1:0 7,3
===================================================================
Instance #1: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
Status: Running, P0: Link down
MAC Address: 00:60:dd:47:b3:c3
Product code: M3F-PCIXF-2
Part number: 09-03392
Serial number: 296612
Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
Mapped hosts: 10
======
*indus3*
======
MX Version: 1.1.7rc3cvs1_1_fixes
MX Build: @indus3:/opt/mx2g-1.1.7rc3 Thu May 31 11:29:03 PKT 2007
2 Myrinet boards installed.
The MX driver is configured to support up to 4 instances and 1024 nodes.
===================================================================
Instance #0: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
Status: Running, P0: Link up
MAC Address: 00:60:dd:47:ad:66
Product code: M3F-PCIXF-2
Part number: 09-03392
Serial number: 297240
Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
Mapped hosts: 10
ROUTE COUNT
INDEX MAC ADDRESS HOST NAME P0
----- ----------- --------- ---
0) 00:60:dd:47:ad:66 indus3:0 1,1
1) 00:60:dd:47:ad:76 indus3:1 8,3
2) 00:60:dd:47:ad:68 indus4:0 1,1
3) 00:60:dd:47:b3:e8 indus4:1 6,3
4) 00:60:dd:47:ad:77 jhelum1:0 8,3
5) 00:60:dd:47:b3:ab indus2:0 1,1
7) 00:60:dd:47:ad:7c indus1:0 8,3
8) 00:60:dd:47:b3:5a ravi2:0 8,3
9) 00:60:dd:47:ad:5f ravi2:1 7,3
10) 00:60:dd:47:b3:bf ravi1:0 8,3
===================================================================
Instance #1: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
Status: Running, P0: Link up
MAC Address: 00:60:dd:47:ad:76
Product code: M3F-PCIXF-2
Part number: 09-03392
Serial number: 297224
Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
Mapped hosts: 10
ROUTE COUNT
INDEX MAC ADDRESS HOST NAME P0
----- ----------- --------- ---
0) 00:60:dd:47:ad:66 indus3:0 8,3
1) 00:60:dd:47:ad:76 indus3:1 1,1
2) 00:60:dd:47:ad:68 indus4:0 7,3
3) 00:60:dd:47:b3:e8 indus4:1 1,1
4) 00:60:dd:47:ad:77 jhelum1:0 1,1
5) 00:60:dd:47:b3:ab indus2:0 7,3
7) 00:60:dd:47:ad:7c indus1:0 8,3
8) 00:60:dd:47:b3:5a ravi2:0 6,3
9) 00:60:dd:47:ad:5f ravi2:1 8,3
10) 00:60:dd:47:b3:bf ravi1:0 8,3
======
*indus4*
======
MX Version: 1.1.7rc3cvs1_1_fixes
MX Build: @indus4:/opt/mx2g-1.1.7rc3 Thu May 31 11:36:59 PKT 2007
2 Myrinet boards installed.
The MX driver is configured to support up to 4 instances and 1024 nodes.
===================================================================
Instance #0: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
Status: Running, P0: Link up
MAC Address: 00:60:dd:47:ad:68
Product code: M3F-PCIXF-2
Part number: 09-03392
Serial number: 297238
Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
Mapped hosts: 10
ROUTE COUNT
INDEX MAC ADDRESS HOST NAME P0
----- ----------- --------- ---
0) 00:60:dd:47:ad:68 indus4:0 1,1
1) 00:60:dd:47:b3:e8 indus4:1 7,3
2) 00:60:dd:47:ad:77 jhelum1:0 7,3
3) 00:60:dd:47:ad:66 indus3:0 1,1
4) 00:60:dd:47:ad:76 indus3:1 7,3
5) 00:60:dd:47:b3:ab indus2:0 1,1
7) 00:60:dd:47:ad:7c indus1:0 7,3
8) 00:60:dd:47:b3:5a ravi2:0 7,3
9) 00:60:dd:47:ad:5f ravi2:1 8,3
10) 00:60:dd:47:b3:bf ravi1:0 7,3
===================================================================
Instance #1: 333.2 MHz LANai, 66.7 MHz PCI bus, 2 MB SRAM
Status: Running, P0: Link up
MAC Address: 00:60:dd:47:b3:e8
Product code: M3F-PCIXF-2
Part number: 09-03392
Serial number: 296575
Mapper: 00:60:dd:47:b3:e8, version = 0x7677b8ba, configured
Mapped hosts: 10
ROUTE COUNT
INDEX MAC ADDRESS HOST NAME P0
----- ----------- --------- ---
0) 00:60:dd:47:ad:68 indus4:0 6,3
1) 00:60:dd:47:b3:e8 indus4:1 1,1
2) 00:60:dd:47:ad:77 jhelum1:0 1,1
3) 00:60:dd:47:ad:66 indus3:0 8,3
4) 00:60:dd:47:ad:76 indus3:1 1,1
5) 00:60:dd:47:b3:ab indus2:0 8,3
7) 00:60:dd:47:ad:7c indus1:0 7,3
8) 00:60:dd:47:b3:5a ravi2:0 6,3
9) 00:60:dd:47:ad:5f ravi2:1 8,3
10) 00:60:dd:47:b3:bf ravi1:0 8,3
The output from *ompi_info* is:
Open MPI: 1.2.1r14096-ct7b030r1838
Open MPI SVN revision: 0
Open RTE: 1.2.1r14096-ct7b030r1838
Open RTE SVN revision: 0
OPAL: 1.2.1r14096-ct7b030r1838
OPAL SVN revision: 0
Prefix: /opt/SUNWhpc/HPC7.0
Configured architecture: sparc-sun-solaris2.10
Configured by: root
Configured on: Fri Mar 30 12:49:36 EDT 2007
Configure host: burpen-on10-0
Built by: root
Built on: Fri Mar 30 13:10:46 EDT 2007
Built host: burpen-on10-0
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: yes
Fortran90 bindings size: trivial
C compiler: cc
C compiler absolute: /ws/ompi-tools/SUNWspro/SOS11/bin/cc
C++ compiler: CC
C++ compiler absolute: /ws/ompi-tools/SUNWspro/SOS11/bin/CC
Fortran77 compiler: f77
Fortran77 compiler abs: /ws/ompi-tools/SUNWspro/SOS11/bin/f77
Fortran90 compiler: f95
Fortran90 compiler abs: /ws/ompi-tools/SUNWspro/SOS11/bin/f95
C profiling: yes
C++ profiling: yes
Fortran77 profiling: yes
Fortran90 profiling: yes
C++ exceptions: yes
Thread support: no
Internal debug support: no
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
libltdl support: yes
Heterogeneous support: yes
mpirun default --prefix: yes
MCA backtrace: printstack (MCA v1.0, API v1.0, Component v1.2.1)
MCA paffinity: solaris (MCA v1.0, API v1.0, Component v1.2.1)
MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.1)
MCA timer: solaris (MCA v1.0, API v1.0, Component v1.2.1)
MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.1)
MCA coll: self (MCA v1.0, API v1.0, Component v1.2.1)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.1)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.1)
MCA io: romio (MCA v1.0, API v1.0, Component v1.2.1)
MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.1)
MCA mpool: udapl (MCA v1.0, API v1.0, Component v1.2.1)
MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.1)
MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.1)
MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.1)
MCA rcache: rb (MCA v1.0, API v1.0, Component v1.2.1)
MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.1)
MCA btl: mx (MCA v1.0, API v1.0.1, Component v1.2.1)
MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.1)
MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.1)
MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
MCA btl: udapl (MCA v1.0, API v1.0, Component v1.2.1)
MCA mtl: mx (MCA v1.0, API v1.0, Component v1.2.1)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.1)
MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.1)
MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.1)
MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.1)
MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.1)
MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.1)
MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.1)
MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.1)
MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.1)
MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.1)
MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.1)
MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.1)
MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.1)
MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.1)
MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.1)
MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.1)
MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.1)
MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.1)
MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.1)
MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.1)
MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.1)
MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.1)
MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.1)
MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.1)
MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.1)
MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.1)
MCA pls: tm (MCA v1.0, API v1.3, Component v1.2.1)
MCA sds: env (MCA v1.0, API v1.0, Component v1.2.1)
MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.1)
MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.1)
MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2.1)
When I try to run a simple hello-world program with the following command:
*mpirun -np 4 -mca btl mx,sm,self -machinefile machines ./hello*
the following error appears:
--------------------------------------------------------------------------
Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
PML add procs failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--------------------------------------------------------------------------
Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
PML add procs failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--------------------------------------------------------------------------
Process 0.1.3 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
PML add procs failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Process 0.1.2 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
PML add procs failed
--> Returned "Unreachable" (-*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
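(I have also wondered whether I should be selecting MX through the MTL instead of the BTL, for example something like *mpirun -np 4 -mca pml cm -machinefile machines ./hello*, since ompi_info lists both an mx BTL and an mx MTL, but I am not sure whether that would behave any differently here.)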
The output from *more /var/run/fms/fma.log* is:
Sat Sep 22 10:47:50 2007 NIC 0: M3F-PCIXF-2 s/n=297218 1 ports, speed=2G
Sat Sep 22 10:47:50 2007 mac = 00:60:dd:47:ad:7c
Sat Sep 22 10:47:50 2007 NIC 1: M3F-PCIXF-2 s/n=297248 1 ports, speed=2G
Sat Sep 22 10:47:50 2007 mac = 00:60:dd:47:ad:5e
Sat Sep 22 10:47:50 2007 fms-1.2.1 fma starting
Sat Sep 22 10:47:50 2007 Mapper was 00:00:00:00:00:00, l=0, is now
00:60:dd:47:ad:7c, l=1
Sat Sep 22 10:47:50 2007 Mapping fabric...
Sat Sep 22 10:47:54 2007 Mapper was 00:60:dd:47:ad:7c, l=1, is now
00:60:dd:47:b3:e8, l=1
Sat Sep 22 10:47:54 2007 Cancelling mapping
Sat Sep 22 10:47:59 2007 5 hosts, 8 nics, 6 xbars, 40 links
Sat Sep 22 10:47:59 2007 map version is 1987557551
Sat Sep 22 10:47:59 2007 Found NIC 0 at index 3!
Sat Sep 22 10:47:59 2007 Found NIC 1 at index 2!
Sat Sep 22 10:47:59 2007 map seems OK
Sat Sep 22 10:47:59 2007 Routing took 0 seconds
Mon Sep 24 14:26:46 2007 Requesting remap from indus4
(00:60:dd:47:b3:e8): scouted by 00:60:dd:47:b3:5a, lev=1, pkt_type=0
Mon Sep 24 14:26:51 2007 6 hosts, 10 nics, 6 xbars, 42 links
Mon Sep 24 14:26:51 2007 map version is 1987557552
Mon Sep 24 14:26:51 2007 Found NIC 0 at index 3!
Mon Sep 24 14:26:51 2007 Found NIC 1 at index 2!
Mon Sep 24 14:26:51 2007 map seems OK
Mon Sep 24 14:26:51 2007 Routing took 0 seconds
Mon Sep 24 14:35:17 2007 Requesting remap from indus4
(00:60:dd:47:b3:e8): scouted by 00:60:dd:47:b3:bf, lev=1, pkt_type=0
Mon Sep 24 14:35:19 2007 7 hosts, 11 nics, 6 xbars, 43 links
Mon Sep 24 14:35:19 2007 map version is 1987557553
Mon Sep 24 14:35:19 2007 Found NIC 0 at index 5!
Mon Sep 24 14:35:19 2007 Found NIC 1 at index 4!
Mon Sep 24 14:35:19 2007 map seems OK
Mon Sep 24 14:35:19 2007 Routing took 0 seconds
Tue Sep 25 21:47:52 2007 6 hosts, 9 nics, 6 xbars, 41 links
Tue Sep 25 21:47:52 2007 map version is 1987557554
Tue Sep 25 21:47:52 2007 Found NIC 0 at index 3!
Tue Sep 25 21:47:52 2007 Found NIC 1 at index 2!
Tue Sep 25 21:47:52 2007 map seems OK
Tue Sep 25 21:47:52 2007 Routing took 0 seconds
Tue Sep 25 21:52:02 2007 Requesting remap from indus4
(00:60:dd:47:b3:e8): empty port x0p15 is no longer empty
Tue Sep 25 21:52:07 2007 6 hosts, 10 nics, 6 xbars, 42 links
Tue Sep 25 21:52:07 2007 map version is 1987557555
Tue Sep 25 21:52:07 2007 Found NIC 0 at index 4!
Tue Sep 25 21:52:07 2007 Found NIC 1 at index 3!
Tue Sep 25 21:52:07 2007 map seems OK
Tue Sep 25 21:52:07 2007 Routing took 0 seconds
Tue Sep 25 21:52:23 2007 7 hosts, 11 nics, 6 xbars, 43 links
Tue Sep 25 21:52:23 2007 map version is 1987557556
Tue Sep 25 21:52:23 2007 Found NIC 0 at index 6!
Tue Sep 25 21:52:23 2007 Found NIC 1 at index 5!
Tue Sep 25 21:52:23 2007 map seems OK
Tue Sep 25 21:52:23 2007 Routing took 0 seconds
Wed Sep 26 05:07:01 2007 Requesting remap from indus4
(00:60:dd:47:b3:e8): verify failed x1p2, nic 0, port 0 route=-9 4 10
reply=-10 -4 9 , remote=ravi2 NIC
1, p0 mac=00:60:dd:47:ad:5f
Wed Sep 26 05:07:06 2007 6 hosts, 9 nics, 6 xbars, 41 links
Wed Sep 26 05:07:06 2007 map version is 1987557557
Wed Sep 26 05:07:06 2007 Found NIC 0 at index 3!
Wed Sep 26 05:07:06 2007 Found NIC 1 at index 2!
Wed Sep 26 05:07:06 2007 map seems OK
Wed Sep 26 05:07:06 2007 Routing took 0 seconds
Wed Sep 26 05:11:19 2007 7 hosts, 11 nics, 6 xbars, 43 links
Wed Sep 26 05:11:19 2007 map version is 1987557558
Wed Sep 26 05:11:19 2007 Found NIC 0 at index 3!
Wed Sep 26 05:11:19 2007 Found NIC 1 at index 2!
Wed Sep 26 05:11:19 2007 map seems OK
Wed Sep 26 05:11:19 2007 Routing took 0 seconds
Thu Sep 27 11:45:37 2007 6 hosts, 9 nics, 6 xbars, 41 links
Thu Sep 27 11:45:37 2007 map version is 1987557559
Thu Sep 27 11:45:37 2007 Found NIC 0 at index 6!
Thu Sep 27 11:45:37 2007 Found NIC 1 at index 5!
Thu Sep 27 11:45:37 2007 map seems OK
Thu Sep 27 11:45:37 2007 Routing took 0 seconds
Thu Sep 27 11:51:02 2007 7 hosts, 11 nics, 6 xbars, 43 links
Thu Sep 27 11:51:02 2007 map version is 1987557560
Thu Sep 27 11:51:02 2007 Found NIC 0 at index 6!
Thu Sep 27 11:51:02 2007 Found NIC 1 at index 5!
Thu Sep 27 11:51:02 2007 map seems OK
Thu Sep 27 11:51:02 2007 Routing took 0 seconds
Fri Sep 28 13:27:10 2007 Requesting remap from indus4
(00:60:dd:47:b3:e8): verify failed x5p0, nic 1, port 0 route=-8 15 6
reply=-6 -15 8 , remote=ravi1 NIC
0, p0 mac=00:60:dd:47:b3:bf
Fri Sep 28 13:27:24 2007 6 hosts, 8 nics, 6 xbars, 40 links
Fri Sep 28 13:27:24 2007 map version is 1987557561
Fri Sep 28 13:27:24 2007 Found NIC 0 at index 5!
Fri Sep 28 13:27:24 2007 Cannot find NIC 1 (00:60:dd:47:ad:5e) in map!
Fri Sep 28 13:27:24 2007 map seems OK
Fri Sep 28 13:27:24 2007 Routing took 0 seconds
Fri Sep 28 13:27:44 2007 7 hosts, 10 nics, 6 xbars, 42 links
Fri Sep 28 13:27:44 2007 map version is 1987557562
Fri Sep 28 13:27:44 2007 Found NIC 0 at index 7!
Fri Sep 28 13:27:44 2007 Cannot find NIC 1 (00:60:dd:47:ad:5e) in map!
Fri Sep 28 13:27:44 2007 map seems OK
Fri Sep 28 13:27:44 2007 Routing took 0 seconds
Do you have any suggestions or comments on why this error appears, and what the solution might be? I have searched the community mailing list for this problem and found a few related topics, but could not find a solution. Any suggestions or comments would be highly appreciated.

The code that I am trying to run is as follows:
#include <stdio.h>
#include <string.h>   /* for strcpy() */
#include "mpi.h"

int main(int argc, char **argv)
{
    int rank, size, tag, rc, i;
    MPI_Status status;
    char message[20];

    rc = MPI_Init(&argc, &argv);
    rc = MPI_Comm_size(MPI_COMM_WORLD, &size);
    rc = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    tag = 100;

    if (rank == 0) {
        /* rank 0 sends the greeting to every other rank */
        strcpy(message, "Hello, world");
        for (i = 1; i < size; i++)
            rc = MPI_Send(message, 13, MPI_CHAR, i, tag, MPI_COMM_WORLD);
    }
    else {
        rc = MPI_Recv(message, 13, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
    }

    printf("node %d : %.13s\n", rank, message);
    rc = MPI_Finalize();
    return 0;
}
Thanks, and looking forward to your reply.
Best regards,
Hammad Siddiqi
Center for High Performance Scientific Computing
NUST Institute of Information Technology,
National University of Sciences and Technology,
Rawalpindi, Pakistan.