[OMPI users] job aborts "readv failed: Connection reset by peer"

2016-08-30 Thread Mahmood Naderan
Hi,
An MPI job is running on two nodes and everything seems to be fine.
However, in the middle of the run, the program aborts with the following
error


[compute-0-1.local][[47664,1],14][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[compute-0-3.local][[47664,1],11][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[compute-0-3.local][[47664,1],13][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 4989 on node compute-0-1 exited
on signal 4 (Illegal instruction).
--------------------------------------------------------------------------


There are 8 processes on that node and each consumes about 150MB of memory.
The total memory usage is about 1% of the memory.

There are some discussions on the web about memory error but there is no
clear answer for that. What does that illegal instruction mean?




Regards,
Mahmood
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] job aborts "readv failed: Connection reset by peer"

2016-08-30 Thread Gilles Gouaillardet
In the absence of a clearer error message, the btl_tcp_frag related error
messages can suggest a process was killed by the oom-killer.
That is not your case, since rank 0 died because of an illegal instruction.

Are you running under a batch manager?
On which architecture?
Do your compute nodes have the very same architecture as the node used to
compile your libs and apps?
That kind of error can occur if your app was built with AVX2 instructions
(e.g. for the latest Intel Xeon) but runs on a previous-generation processor
that is not AVX2 capable.
I guess the same thing can occur if different ARM versions are involved.
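One quick way to compare the build node and the compute node is to look at the
CPU feature flags in /proc/cpuinfo. A minimal sketch (Linux-specific; the
function name is illustrative, not part of Open MPI):

```python
def cpu_has_flag(flag):
    """Return True if /proc/cpuinfo lists the given CPU feature flag.

    Linux-only sketch: x86 kernels expose features on the "flags" line;
    on other platforms (or if the file is missing) this just returns False.
    """
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return flag in line.split(":", 1)[1].split()
    except OSError:
        pass
    return False

# Run this on both the build node and the compute node; if the build node
# reports True for "avx2" and the compute node False, SIGILL is expected.
print(cpu_has_flag("avx2"))
```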

Can you
ulimit -c unlimited
and mpirun again? Hopefully you will get a core file that points you to the
illegal instruction.
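The same core-file limit can also be raised programmatically at the top of the
application, before the crash, which is handy when the batch manager resets
ulimits in job shells. A small sketch using Python's standard resource module
(nothing here is Open MPI specific):

```python
import resource

# Raise the soft core-file limit to the hard limit so that a fatal signal
# (e.g. SIGILL, signal 4 as in the report above) leaves a core dump behind.
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))

# The core file can then be inspected with e.g. "gdb ./a.out core" to find
# the faulting instruction.
print(resource.getrlimit(resource.RLIMIT_CORE))
```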



Cheers,

Gilles


Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-30 Thread Jingchao Zhang
I checked again and as far as I can tell, everything was setup correctly. I 
added "HCC debug" to the output message to make sure it's the correct plugin.


The updated outputs:
$ mpirun ./a.out < test.in
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 35 
for process [[26513,1],0]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 41 
for process [[26513,1],0]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 43 
for process [[26513,1],0]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 37 
for process [[26513,1],1]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 46 
for process [[26513,1],1]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 49 
for process [[26513,1],1]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 38 
for process [[26513,1],2]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 50 
for process [[26513,1],2]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 52 
for process [[26513,1],2]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 42 
for process [[26513,1],3]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 53 
for process [[26513,1],3]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 55 
for process [[26513,1],3]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 45 
for process [[26513,1],4]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 56 
for process [[26513,1],4]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 58 
for process [[26513,1],4]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 47 
for process [[26513,1],5]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 59 
for process [[26513,1],5]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 61 
for process [[26513,1],5]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 51 
for process [[26513,1],6]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 62 
for process [[26513,1],6]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 64 
for process [[26513,1],6]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 57 
for process [[26513,1],7]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 66 
for process [[26513,1],7]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 68 
for process [[26513,1],7]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 63 
for process [[26513,1],8]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 70 
for process [[26513,1],8]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 72 
for process [[26513,1],8]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 67 
for process [[26513,1],9]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 74 
for process [[26513,1],9]
[c1725.crane.hcc.unl.edu:218844] HCC debug: [[26513,0],0] iof:hnp pushing fd 76 
for process [[26513,1],9]
Rank 1 has cleared MPI_Init
Rank 3 has cleared MPI_Init
Rank 4 has cleared MPI_Init
Rank 5 has cleared MPI_Init
Rank 6 has cleared MPI_Init
Rank 7 has cleared MPI_Init
Rank 0 has cleared MPI_Init
Rank 2 has cleared MPI_Init
Rank 8 has cleared MPI_Init
Rank 9 has cleared MPI_Init
Rank 10 has cleared MPI_Init
Rank 11 has cleared MPI_Init
Rank 12 has cleared MPI_Init
Rank 13 has cleared MPI_Init
Rank 16 has cleared MPI_Init
Rank 17 has cleared MPI_Init
Rank 18 has cleared MPI_Init
Rank 14 has cleared MPI_Init
Rank 15 has cleared MPI_Init
Rank 19 has cleared MPI_Init



The part of the code I changed in file ./orte/mca/iof/hnp/iof_hnp.c:


    opal_output(0,
                "HCC debug: %s iof:hnp pushing fd %d for process %s",
                ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
                fd, ORTE_NAME_PRINT(dst_name));

    /* don't do this if the dst vpid is invalid or the fd is negative! */
    if (ORTE_VPID_INVALID == dst_name->vpid || fd < 0) {
        return ORTE_SUCCESS;
    }

    /* OPAL_OUTPUT_VERBOSE((1, orte_iof_base_framework.framework_output,
                            "%s iof:hnp pushing fd %d for process %s",
                            ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
                            fd, ORTE_NAME_PRINT(dst_name))); */



From: users  on behalf of r...@open-mpi.org 

Sent: Monday, August 29, 2016 11:42:00 AM
To: Open MPI Users
Subject: Re: [OMPI users] stdin issue with openmpi/2.0.0

I’m sorry, but something is simply very wrong here. Are you sure you are 
pointed at the correct LD

Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-30 Thread r...@open-mpi.org
Hmmm...well, the problem appears to be that we aren’t setting up the input 
channel to read stdin. This happens immediately after the application is 
launched - there is no “if” clause or anything else in front of it. The only 
way it wouldn’t get called is if all the procs weren’t launched, but that 
appears to be happening, yes?

Hence my confusion - there is no test in front of that print statement now, and 
yet we aren’t seeing the code being called.

Could you please add “-mca plm_base_verbose 5” to your cmd line? We should see 
a debug statement print that contains "plm:base:launch wiring up iof for job”




[OMPI users] Certain files for mpi missing when building mpi4py

2016-08-30 Thread Mahdi, Sam
HI everyone,

I am using a linux fedora. I downloaded/installed
openmpi-1.7.3-1.fc20(64-bit) and openmpi-devel-1.7.3-1.fc20(64-bit). As
well as pypar-openmpi-2.1.5_108-3.fc20(64-bit) and
python3-mpi4py-openmpi-1.3.1-1.fc20(64-bit). The problem I am having is
building mpi4py using the mpicc wrapper. I have installed and untarred
mpi4py from https://pypi.python.org/pypi/mpi4py#downloads. I went to
compile it and received this error. I typed in
python setup.py build --mpicc=/usr/lib64/mpich/bin/mpicc
This was the output
running build
running build_src
running build_py
running build_clib
MPI configuration: [mpi] from 'mpi.cfg'
MPI C compiler:/usr/lib64/mpich/bin/mpicc
running build_ext
MPI configuration: [mpi] from 'mpi.cfg'
MPI C compiler:/usr/lib64/mpich/bin/mpicc
checking for MPI compile and link ...
/usr/lib64/mpich/bin/mpicc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong
--param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic
-D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong
--param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic
-D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python2.7 -c _configtest.c
-o _configtest.o
_configtest.c:2:17: fatal error: mpi.h: No such file or directory
 #include <mpi.h>
                 ^
compilation terminated.
failure.
removing: _configtest.c _configtest.o
error: Cannot compile MPI programs. Check your configuration!!!

I found the file mpi.h and decided to directly export it to the path.
export
path=$path:/usr/lib64/python2.7/site-packages/mpich/mpi4py/include/mpi4py/mpi4py.h
But this did not resolve the include mpi.h dilemma. I still receive the
same error when attempting to build mpi4py using the mpicc wrapper.
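For context on what the build step needs: mpi.h has to be found by the
compiler's include path, not by the shell PATH, so exporting the header's
location into "path" has no effect. mpi4py's build reads an mpi.cfg file
where the MPI installation can be pointed at explicitly. A hedged sketch of
such a section, assuming a Fedora-style Open MPI layout under
/usr/lib64/openmpi (the exact paths on any given system are an assumption
and should be checked):

```ini
; mpi.cfg -- hypothetical example pointing mpi4py at an Open MPI install
[openmpi]
mpi_dir      = /usr/lib64/openmpi
mpicc        = %(mpi_dir)s/bin/mpicc
include_dirs = /usr/include/openmpi-x86_64
library_dirs = %(mpi_dir)s/lib
```

The build would then be invoked with "python setup.py build --mpi=openmpi"
so that the Open MPI wrapper, rather than the MPICH one from
/usr/lib64/mpich/bin, compiles the extension.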

Thank you in advance
-Sam

Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-30 Thread Jingchao Zhang
Yes, all procs were launched properly. I added “-mca plm_base_verbose 5” to the 
mpirun command. Please see attached for the results.


$mpirun -mca plm_base_verbose 5 ./a.out < test.in


I mentioned in my initial post that the test job can run properly for the 1st 
time. But if I kill the job and resubmit, then it hangs. It happened with the 
job above as well. Very odd.



Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-30 Thread r...@open-mpi.org
Well, that helped a bit. For some reason, your system is skipping a step in the 
launch state machine, and so we never hit the step where we setup the IO 
forwarding system.

Sorry to keep poking, but I haven’t seen this behavior anywhere else, and so I 
have no way to replicate it. Must be a subtle race condition.

Can you replace “plm” with “state” and try to hit a “bad” run again?



Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-30 Thread r...@open-mpi.org
Oh my - that indeed illustrated the problem!! It is indeed a race condition on 
the backend orted. I’ll try to fix it - probably have to send you a patch to 
test?

> On Aug 30, 2016, at 1:04 PM, Jingchao Zhang  wrote:
> 
> $mpirun -mca state_base_verbose 5 ./a.out < test.in
> 
> Please see attached for the outputs.
> 
> Thank you Ralph. I am willing to provide whatever information you need.

[OMPI users] bug? "The system limit on number of children a process can have was reached"

2016-08-30 Thread Jason Maldonis
Hello everyone,

I am using openmpi-1.10.2 and I am using the `spawn_multiple` MPI function
inside a for-loop. My program spawns N workers within each iteration of the
for-loop, makes some changes to the input for the next iteration, and then
proceeds to the next iteration.

After a few iterations (~40), I am getting the following error:

ORTE_ERROR_LOG: The system limit on number of children a process can have
was reached in file odls_default_module.c at line 928

However, I believe I am successfully disconnecting the workers at the end
of each iteration, and I am never creating more than 8 workers at a time. I
am running on a single node with 16 cores.

I received some help from you previously in this thread,
where Nathan Hjelm / Ralph Castain found a bug that was leading to a
different error. Ralph found a work-around for now by telling me to add
"-mca btl tcp,sm,self" to the mpirun cmd line.  I also use the
"-oversubscribe" option. My full executable line looks like this:

mpiexec -np 16 -oversubscribe -mca btl tcp,sm,self python
../../structopt/genetic.py genetic.in.json

I use mpi4py which is why python is run as the executable.

I am hoping someone might know why I am getting the "system limit on number
of children" error or if someone has had a similar error in the past. I
couldn't find anything on Google.

Please let me know if I can give you additional information to help.

Thank you,
Jason
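For reference, the spawn-per-iteration pattern described above looks roughly
like this in mpi4py. This is only a sketch, not the poster's actual code: the
worker script name, worker count, and wrapper function are illustrative, and
the import is guarded so the file can be read without mpi4py installed. The
key point is that each intercommunicator returned by Spawn must be freed with
Disconnect, or the parent accumulates child connections across iterations:

```python
# Hypothetical sketch of spawning and disconnecting workers each iteration.
try:
    from mpi4py import MPI
except ImportError:
    MPI = None  # mpi4py not available; definitions below are sketch-only

def run_iteration(n_workers):
    """Spawn n_workers, collect one result per worker, then disconnect them."""
    if MPI is None:
        raise RuntimeError("mpi4py is not available")
    comm = MPI.COMM_SELF.Spawn("python", args=["worker.py"],
                               maxprocs=n_workers)
    # Parent side of an intercommunicator gathers with root=MPI.ROOT.
    results = comm.gather(None, root=MPI.ROOT)
    # Without this Disconnect, each iteration leaks child connections and
    # the orted eventually hits its limit on the number of children.
    comm.Disconnect()
    return results

if __name__ == "__main__" and MPI is not None:
    for step in range(40):
        run_iteration(8)
```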

Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-30 Thread Jingchao Zhang
Yes, I can definitely help to test the patch.


Jingchao


Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-30 Thread Jingchao Zhang
Thank you! The patch fixed the problem. I did multiple tests with your program 
and another application. No more process hangs!


Cheers,


Dr. Jingchao Zhang
Holland Computing Center
University of Nebraska-Lincoln
402-472-6400

From: users <users-boun...@lists.open-mpi.org> on behalf of r...@open-mpi.org <r...@open-mpi.org>
Sent: Tuesday, August 30, 2016 6:37:51 PM
To: Open MPI Users
Subject: Re: [OMPI users] stdin issue with openmpi/2.0.0

Sorry - previous version had a typo in it:

diff --git a/orte/mca/state/orted/state_orted.c b/orte/mca/state/orted/state_orted.c
index e506cc9..eb0c4b4 100644
--- a/orte/mca/state/orted/state_orted.c
+++ b/orte/mca/state/orted/state_orted.c
@@ -156,7 +156,10 @@ static void track_jobs(int fd, short argc, void *cbdata)
     orte_state_caddy_t *caddy = (orte_state_caddy_t*)cbdata;
     opal_buffer_t *alert;
     orte_plm_cmd_flag_t cmd;
-    int rc;
+    int rc, i;
+    orte_proc_state_t running = ORTE_PROC_STATE_RUNNING;
+    orte_proc_t *child;
+    orte_vpid_t null = ORTE_VPID_INVALID;
 
     if (ORTE_JOB_STATE_LOCAL_LAUNCH_COMPLETE == caddy->job_state) {
         OPAL_OUTPUT_VERBOSE((5, orte_state_base_framework.framework_output,
@@ -172,12 +175,52 @@ static void track_jobs(int fd, short argc, void *cbdata)
             OBJ_RELEASE(alert);
             goto cleanup;
         }
-        /* pack the job info */
-        if (ORTE_SUCCESS != (rc = pack_state_update(alert, caddy->jdata))) {
+        /* pack the jobid */
+        if (ORTE_SUCCESS != (rc = opal_dss.pack(alert, &caddy->jdata->jobid, 1, ORTE_JOBID))) {
             ORTE_ERROR_LOG(rc);
             OBJ_RELEASE(alert);
             goto cleanup;
         }
+        for (i=0; i < orte_local_children->size; i++) {
+            if (NULL == (child = (orte_proc_t*)opal_pointer_array_get_item(orte_local_children, i))) {
+                continue;
+            }
+            /* if this child is part of the job... */
+            if (child->name.jobid == caddy->jdata->jobid) {
+                /* pack the child's vpid */
+                if (ORTE_SUCCESS != (rc = opal_dss.pack(alert, &(child->name.vpid), 1, ORTE_VPID))) {
+                    ORTE_ERROR_LOG(rc);
+                    OBJ_RELEASE(alert);
+                    goto cleanup;
+                }
+                /* pack the pid */
+                if (ORTE_SUCCESS != (rc = opal_dss.pack(alert, &child->pid, 1, OPAL_PID))) {
+                    ORTE_ERROR_LOG(rc);
+                    OBJ_RELEASE(alert);
+                    goto cleanup;
+                }
+                /* pack the RUNNING state */
+                if (ORTE_SUCCESS != (rc = opal_dss.pack(alert, &running, 1, ORTE_PROC_STATE))) {
+                    ORTE_ERROR_LOG(rc);
+                    OBJ_RELEASE(alert);
+                    goto cleanup;
+                }
+                /* pack its exit code */
+                if (ORTE_SUCCESS != (rc = opal_dss.pack(alert, &child->exit_code, 1, ORTE_EXIT_CODE))) {
+                    ORTE_ERROR_LOG(rc);
+                    OBJ_RELEASE(alert);
+                    goto cleanup;
+                }
+            }
+        }
+
+        /* flag that this job is complete so the receiver can know */
+        if (ORTE_SUCCESS != (rc = opal_dss.pack(alert, &null, 1, ORTE_VPID))) {
+            ORTE_ERROR_LOG(rc);
+            OBJ_RELEASE(alert);
+            goto cleanup;
+        }
+
         /* send it */
         if (0 > (rc = orte_rml.send_buffer_nb(ORTE_PROC_MY_HNP, alert,
                                               ORTE_RML_TAG_PLM,

On Aug 30, 2016, at 1:51 PM, Jingchao Zhang <zh...@unl.edu> wrote:

Yes, I can definitely help to test the patch.

Jingchao

From: users <users-boun...@lists.open-mpi.org> on behalf of r...@open-mpi.org <r...@open-mpi.org>
Sent: Tuesday, August 30, 2016 2:23:12 PM
To: Open MPI Users
Subject: Re: [OMPI users] stdin issue with openmpi/2.0.0

Oh my - that indeed illustrated the problem!! It is indeed a race condition on 
the backend orted. I’ll try to fix it - probably have to send you a patch to 
test?

On Aug 30, 2016, at 1:04 PM, Jingchao Zhang <zh...@unl.edu> wrote:

$mpirun -mca state_base_verbose 5 ./a.out < test.in

Please see attached for the outputs.

Thank you Ralph. I am willing to provide whatever information you need.


From: users <users-boun...@lists.open-mpi.org> on behalf of r...@open-mpi.org <r...@open-mpi.org>
Sent: Tuesday, August 30, 2016 1:45:45 PM
To: Open MPI Users
Subject: Re: [OMPI users] stdin issue with openmpi/2.0.0

Well, that helped a bit. For some reason, your system is skipping a step in the 
launch state machine, and so we never hit the step where we setup the IO 
forwarding system.

Sorry to keep poking, but I haven’t seen this behavior anywhere else, and so I 
have no way to replicate it. Must be a subtle race condition.

Can you replace “plm” with “state” and try to hit a “bad” run again?

Re: [OMPI users] Certain files for mpi missing when building mpi4py

2016-08-30 Thread Gilles Gouaillardet

Sam,

at first you mentionned Open MPI 1.7.3.

though this is now a legacy version, you posted to the right place.


then you

# python setup.py build --mpicc=/usr/lib64/mpich/bin/mpicc


this is mpich, which is a very reputable MPI implementation, but not 
Open MPI.


so i do invite you to use Open MPI mpicc, and try again.


Cheers,


Gilles


PS

with Open MPI, you can

mpicc -showme ...

in order to display how the underlying compiler (e.g. gcc) is invoked

with mpich, that would be

mpicc -show ...

if the mpich mpicc wrapper is not broken, it should include a path to mpi.h

(e.g. -I/usr/lib64/mpich/include)

and unless a package is missing (mpich-devel ?), that file should exist


you cannot use Open MPI mpi.h with mpich, nor the other way around, and
you should not copy this file to another place


(that should not be needed at all)
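[Editorial note: to make the wrapper check above concrete, here is one way to verify which mpicc you are calling and where it looks for mpi.h. This is a command fragment to adapt, not a script; it assumes a stock Open MPI install is on PATH, and `mpicc -show` is the MPICH spelling.]

```shell
# Which wrapper is first in PATH, and which MPI does it belong to?
which mpicc
mpicc --version            # Open MPI wrappers pass this through to the backend compiler

# Open MPI: print the underlying compile line, including the -I path to mpi.h
mpicc -showme:compile

# MPICH: the equivalent is
#   mpicc -show

# Then point the mpi4py build at the wrapper you actually intend to use:
python setup.py build --mpicc=$(which mpicc)
```

If the printed compile line contains an -I directory that actually holds mpi.h, the "Cannot compile MPI programs" error should go away.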

On 8/31/2016 4:22 AM, Mahdi, Sam wrote:

HI everyone,

I am using a linux fedora. I downloaded/installed 
openmpi-1.7.3-1.fc20(64-bit) and openmpi-devel-1.7.3-1.fc20(64-bit). 
As well as pypar-openmpi-2.1.5_108-3.fc20(64-bit) and 
python3-mpi4py-openmpi-1.3.1-1.fc20(64-bit). The problem I am having 
is building mpi4py using the mpicc wrapper. I have installed and 
untarred mpi4py from https://pypi.python.org/pypi/mpi4py#downloads. I 
went to compile it and received this error. I typed in

python setup.py build --mpicc=/usr/lib64/mpich/bin/mpicc
This was the output
running build
running build_src
running build_py
running build_clib
MPI configuration: [mpi] from 'mpi.cfg'
MPI C compiler:/usr/lib64/mpich/bin/mpicc
running build_ext
MPI configuration: [mpi] from 'mpi.cfg'
MPI C compiler:/usr/lib64/mpich/bin/mpicc
checking for MPI compile and link ...
/usr/lib64/mpich/bin/mpicc -pthread -fno-strict-aliasing -O2 -g -pipe 
-Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong 
--param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic 
-D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall 
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong 
--param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic 
-D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python2.7 -c 
_configtest.c -o _configtest.o

_configtest.c:2:17: fatal error: mpi.h: No such file or directory
 #include <mpi.h>
 ^
compilation terminated.
failure.
removing: _configtest.c _configtest.o
error: Cannot compile MPI programs. Check your configuration!!!

I found the file mpi.h and decided to directly export it to the path.
export 
path=$path:/usr/lib64/python2.7/site-packages/mpich/mpi4py/include/mpi4py/mpi4py.h
But this did not resolve the include mpi.h dilemma. I still recieve 
the same error when attempting to build mpi4py using mpicc wrapper.


Thank you in advance
-Sam



___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
