Re: [OMPI users] MPIIO and OrangeFS

2015-02-25 Thread vithanousek
Thanks for your reply!

I checked my configuration parameters and it seems that everything is correct:
./configure --prefix=/opt/modules/openmpi-1.8.4 --with-sge --with-psm 
--with-pvfs2=/opt/orangefs 
--with-io-romio-flags='--with-file-system=pvfs2+ufs+nfs 
--with-pvfs2=/opt/orangefs'

I have added error checking code to my app, and I was getting multiple errors, 
such as MPI_ERR_AMODE, MPI_ERR_UNKNOWN, MPI_ERR_NO_SUCH_FILE, and MPI_ERR_IO 
(depending on the permissions of the pvfs2 mount point, and on --mca io 
romio/ompio --mca fs pvfs2).
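
For reference, the error checking I added looks roughly like this (a minimal C 
sketch; the helper name check_mpi is only for illustration, and it relies on 
the MPI default of MPI_ERRORS_RETURN for file handles):

  #include <mpi.h>
  #include <stdio.h>

  static void check_mpi(int rc, const char *where)
  {
      if (rc != MPI_SUCCESS) {
          char msg[MPI_MAX_ERROR_STRING];
          int len, eclass;
          MPI_Error_class(rc, &eclass);     /* e.g. MPI_ERR_AMODE, MPI_ERR_IO */
          MPI_Error_string(rc, msg, &len);  /* human-readable description */
          fprintf(stderr, "%s failed (class %d): %s\n", where, eclass, msg);
          MPI_Abort(MPI_COMM_WORLD, rc);
      }
  }

That is how I mapped the numeric return codes to the error classes listed above.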

But it seems that the error is in the source code of my application, because I 
can't find any more detailed documentation about using ROMIO and OMPIO. 
I found at https://surfsara.nl/systems/lisa/software/pvfs2 that I should use 
"pvfs2:/pvfs_mount_point/name_of_file" as the filename instead of 
"/pvfs_mount_point/name_of_file". This works with ROMIO.

Do you know how to use OMPIO without mounting pvfs2? When I tried the same 
filename format as with ROMIO, I got "MPI_ERR_FILE: invalid file".
If I use the normal filename format ("/mountpoint/filename") and force the use 
of pvfs2 with --mca io ompio --mca fs pvfs2, then my app fails with 
mca_fs_base_file_select() failed (and a backtrace).

The OrangeFS documentation (http://docs.orangefs.com/v_2_8_8/index.htm) has a 
chapter about using ROMIO, and it says that I should compile apps with -lpvfs2. 
I have tried it, but nothing changed (ROMIO works with the special filename 
format, OMPIO doesn't work).

Thanks for your help. If you can point me to some useful documentation, I will 
be happy.
Hanousek Vít


-- Original message --
From: Rob Latham 
To: us...@open-mpi.org, vithanou...@seznam.cz
Date: 24. 2. 2015 22:10:08
Subject: Re: [OMPI users] MPIIO and OrangeFS

On 02/24/2015 02:00 PM, vithanousek wrote:
> Hello,
>
> I'm not sure if I have my OrangeFS (2.8.8) and OpenMPI (1.8.4) set up 
> correctly. One short question:
>
> Is it necessary to have OrangeFS mounted through the kernel module if I want 
> to use MPIIO?

nope!

> My simple MPIIO hello world program doesn't work if I haven't mounted 
> OrangeFS. When I mount OrangeFS, it works. So I'm not sure if OMPIO (or 
> ROMIO) is using the pvfs2 servers directly or if it is using the kernel module.
>
> Sorry for the stupid question, but I didn't find any documentation about it.

http://www.pvfs.org/cvs/pvfs-2-8-branch-docs/doc/pvfs2-quickstart/pvfs2-quickstart.php#sec:romio

It sounds like you have not configured your MPI implementation with 
PVFS2 support (OrangeFS is a re-branding of PVFS2, but as far as MPI-IO 
is concerned, they are the same).

OpenMPI passes flags to romio like this at configure time:

  --with-io-romio-flags="--with-file-system=pvfs2+ufs+nfs"

I'm not sure how OMPIO takes flags.

If pvfs2-ping and pvfs2-cp and pvfs2-ls work, then you can bypass the 
kernel.

also, please check return codes:

http://stackoverflow.com/questions/22859269/what-do-mpi-io-error-codes-mean/26373193#26373193

==rob


> Thanks for replays
> Hanousek Vít
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/02/26382.php
>

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

[OMPI users] openmpi-1.8.4 on OSX, mpirun execution error.

2015-02-25 Thread Javier Mas Solé
I have a fresh install of openmpi-1.8.4 on a Mac with OSX-10.9.5. It compiled 
and installed fine. 
I have a Fortran code that runs perfectly on another similar machine with 
openmpi-1.6.5. It compiled
without error on the new Mac. When I try to mpirun it, it gives the message 
below.

Also, if I run echo $PATH I can spot /usr/local/bin, which was the subject of a 
warning in the installation instructions.
I have read in other forums that this might signal duplicate openmpi versions. 
I cannot rule this out, although
I don't find any duplicates in the /usr/local/bin folder.

I'm thinking of uninstalling this version and installing 1.6.5, which works 
fine. 
Can anyone tell me how to do this uninstall?

Thanks a lot

Javier

I have seen a similar post to this one 

fpmac114:AdSHW javier$ /usr/local/bin/mpirun sim1.exe
[fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_ess_slurmd: 
dlopen(/usr/local/lib/openmpi/mca_ess_slurmd.so, 9): Symbol not found: 
_orte_jmap_t_class
  Referenced from: /usr/local/lib/openmpi/mca_ess_slurmd.so
  Expected in: flat namespace
 in /usr/local/lib/openmpi/mca_ess_slurmd.so (ignored)
[fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_errmgr_default: 
dlopen(/usr/local/lib/openmpi/mca_errmgr_default.so, 9): Symbol not found: 
_orte_errmgr_base_error_abort
  Referenced from: /usr/local/lib/openmpi/mca_errmgr_default.so
  Expected in: flat namespace
 in /usr/local/lib/openmpi/mca_errmgr_default.so (ignored)
[fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_routed_cm: 
dlopen(/usr/local/lib/openmpi/mca_routed_cm.so, 9): Symbol not found: 
_orte_message_event_t_class
  Referenced from: /usr/local/lib/openmpi/mca_routed_cm.so
  Expected in: flat namespace
 in /usr/local/lib/openmpi/mca_routed_cm.so (ignored)
[fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_routed_linear: 
dlopen(/usr/local/lib/openmpi/mca_routed_linear.so, 9): Symbol not found: 
_orte_message_event_t_class
  Referenced from: /usr/local/lib/openmpi/mca_routed_linear.so
  Expected in: flat namespace
 in /usr/local/lib/openmpi/mca_routed_linear.so (ignored)
[fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_grpcomm_basic: 
dlopen(/usr/local/lib/openmpi/mca_grpcomm_basic.so, 9): Symbol not found: 
_opal_profile
  Referenced from: /usr/local/lib/openmpi/mca_grpcomm_basic.so
  Expected in: flat namespace
 in /usr/local/lib/openmpi/mca_grpcomm_basic.so (ignored)
[fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_grpcomm_hier: 
dlopen(/usr/local/lib/openmpi/mca_grpcomm_hier.so, 9): Symbol not found: 
_orte_daemon_cmd_processor
  Referenced from: /usr/local/lib/openmpi/mca_grpcomm_hier.so
  Expected in: flat namespace
 in /usr/local/lib/openmpi/mca_grpcomm_hier.so (ignored)
[fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_filem_rsh: 
dlopen(/usr/local/lib/openmpi/mca_filem_rsh.so, 9): Symbol not found: 
_opal_uses_threads
  Referenced from: /usr/local/lib/openmpi/mca_filem_rsh.so
  Expected in: flat namespace
 in /usr/local/lib/openmpi/mca_filem_rsh.so (ignored)
[fpmac114:00398] *** Process received signal ***
[fpmac114:00398] Signal: Segmentation fault: 11 (11)
[fpmac114:00398] Signal code: Address not mapped (1)
[fpmac114:00398] Failing at address: 0x10013
[fpmac114:00398] [ 0] 0   libsystem_platform.dylib
0x7fff933125aa _sigtramp + 26
[fpmac114:00398] [ 1] 0   ??? 
0x7fff5b7f00ff 0x0 + 140734728438015
[fpmac114:00398] [ 2] 0   libopen-rte.7.dylib 
0x000104469ee5 orte_rmaps_base_map_job + 1525
[fpmac114:00398] [ 3] 0   libopen-pal.6.dylib 
0x0001044e4346 opal_libevent2021_event_base_loop + 2214
[fpmac114:00398] [ 4] 0   mpirun  
0x000104411bc0 orterun + 6320
[fpmac114:00398] [ 5] 0   mpirun  
0x0001044102f2 main + 34
[fpmac114:00398] [ 6] 0   libdyld.dylib   
0x7fff8d08a5fd start + 1
[fpmac114:00398] [ 7] 0   ??? 
0x0002 0x0 + 2
[fpmac114:00398] *** End of error message ***
Segmentation fault: 11



Re: [OMPI users] openmpi-1.8.4 on OSX, mpirun execution error.

2015-02-25 Thread Jeff Squyres (jsquyres)
If you had an older Open MPI installed into /usr/local before you installed 
Open MPI 1.8.4 into /usr/local, it's quite possible that some of the older 
plugins are still there (and will not play nicely with the 1.8.4 install).

Specifically: installing a new Open MPI does not uninstall an older Open MPI.

What you can probably do is

rm -rf /usr/local/lib/openmpi

This will completely delete *all* Open MPI plugins (both new and old) from the 
/usr/local tree.

Then re-install the 1.8.4 again, and see if that works for you.



> On Feb 25, 2015, at 7:52 AM, Javier Mas Solé  
> wrote:
> 
> I have a fresh install of openmpi-1.8.4 on a Mac with OSX-10.9.5. It 
> compiled and installed fine. 
> I have a Fortran code that runs perfectly on another similar machine with 
> openmpi-1.6.5. It compiled
> without error on the new Mac. When I try to mpirun it, it gives the message 
> below.
> 
> Also, if I run echo $PATH I can spot /usr/local/bin, which was the subject 
> of a warning in the installation instructions.
> I have read in other forums that this might signal duplicate openmpi 
> versions. I cannot rule this out, although
> I don't find any duplicates in the /usr/local/bin folder.
> 
> I'm thinking of uninstalling this version and installing 1.6.5, which works 
> fine. 
> Can anyone tell me how to do this uninstall?
> 
> Thanks a lot
> 
> Javier
> 
> I have seen a similar post to this one 
> 
> fpmac114:AdSHW javier$ /usr/local/bin/mpirun sim1.exe
> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
> /usr/local/lib/openmpi/mca_ess_slurmd: 
> dlopen(/usr/local/lib/openmpi/mca_ess_slurmd.so, 9): Symbol not found: 
> _orte_jmap_t_class
>   Referenced from: /usr/local/lib/openmpi/mca_ess_slurmd.so
>   Expected in: flat namespace
>  in /usr/local/lib/openmpi/mca_ess_slurmd.so (ignored)
> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
> /usr/local/lib/openmpi/mca_errmgr_default: 
> dlopen(/usr/local/lib/openmpi/mca_errmgr_default.so, 9): Symbol not found: 
> _orte_errmgr_base_error_abort
>   Referenced from: /usr/local/lib/openmpi/mca_errmgr_default.so
>   Expected in: flat namespace
>  in /usr/local/lib/openmpi/mca_errmgr_default.so (ignored)
> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
> /usr/local/lib/openmpi/mca_routed_cm: 
> dlopen(/usr/local/lib/openmpi/mca_routed_cm.so, 9): Symbol not found: 
> _orte_message_event_t_class
>   Referenced from: /usr/local/lib/openmpi/mca_routed_cm.so
>   Expected in: flat namespace
>  in /usr/local/lib/openmpi/mca_routed_cm.so (ignored)
> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
> /usr/local/lib/openmpi/mca_routed_linear: 
> dlopen(/usr/local/lib/openmpi/mca_routed_linear.so, 9): Symbol not found: 
> _orte_message_event_t_class
>   Referenced from: /usr/local/lib/openmpi/mca_routed_linear.so
>   Expected in: flat namespace
>  in /usr/local/lib/openmpi/mca_routed_linear.so (ignored)
> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
> /usr/local/lib/openmpi/mca_grpcomm_basic: 
> dlopen(/usr/local/lib/openmpi/mca_grpcomm_basic.so, 9): Symbol not found: 
> _opal_profile
>   Referenced from: /usr/local/lib/openmpi/mca_grpcomm_basic.so
>   Expected in: flat namespace
>  in /usr/local/lib/openmpi/mca_grpcomm_basic.so (ignored)
> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
> /usr/local/lib/openmpi/mca_grpcomm_hier: 
> dlopen(/usr/local/lib/openmpi/mca_grpcomm_hier.so, 9): Symbol not found: 
> _orte_daemon_cmd_processor
>   Referenced from: /usr/local/lib/openmpi/mca_grpcomm_hier.so
>   Expected in: flat namespace
>  in /usr/local/lib/openmpi/mca_grpcomm_hier.so (ignored)
> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
> /usr/local/lib/openmpi/mca_filem_rsh: 
> dlopen(/usr/local/lib/openmpi/mca_filem_rsh.so, 9): Symbol not found: 
> _opal_uses_threads
>   Referenced from: /usr/local/lib/openmpi/mca_filem_rsh.so
>   Expected in: flat namespace
>  in /usr/local/lib/openmpi/mca_filem_rsh.so (ignored)
> [fpmac114:00398] *** Process received signal ***
> [fpmac114:00398] Signal: Segmentation fault: 11 (11)
> [fpmac114:00398] Signal code: Address not mapped (1)
> [fpmac114:00398] Failing at address: 0x10013
> [fpmac114:00398] [ 0] 0   libsystem_platform.dylib
> 0x7fff933125aa _sigtramp + 26
> [fpmac114:00398] [ 1] 0   ??? 
> 0x7fff5b7f00ff 0x0 + 140734728438015
> [fpmac114:00398] [ 2] 0   libopen-rte.7.dylib 
> 0x000104469ee5 orte_rmaps_base_map_job + 1525
> [fpmac114:00398] [ 3] 0   libopen-pal.6.dylib 
> 0x0001044e4346 opal_libevent2021_event_base_loop + 2214
> [fpmac114:00398] [ 4] 0   mpirun  
> 0x000104411bc0 orterun + 6320
> [fpmac114:00398] [ 5] 0   mpirun  
> 0x0001044102f2 main + 34
> [fpmac114:00398] [ 6] 0   libdyld.dylib  

Re: [OMPI users] openmpi-1.8.4 on OSX, mpirun execution error.

2015-02-25 Thread Javier Mas Solé
Hi Jeff, this was very helpful. Thank you very much indeed. The only tweak is 
that I had to type 

sudo rm -rf /usr/local/lib/openmpi

I reinstalled openmpi and the code is now running seamlessly. Thanks again.

Javier

On 25 Feb 2015, at 14:29, Jeff Squyres (jsquyres)  wrote:

> If you had an older Open MPI installed into /usr/local before you installed 
> Open MPI 1.8.4 into /usr/local, it's quite possible that some of the older 
> plugins are still there (and will not play nicely with the 1.8.4 install).
> 
> Specifically: installing a new Open MPI does not uninstall an older Open MPI.
> 
> What you can probably do is
> 
>rm -rf /usr/local/lib/openmpi
> 
> This will completely delete *all* Open MPI plugins (both new and old) from 
> the /usr/local tree.
> 
> Then re-install the 1.8.4 again, and see if that works for you.
> 
> 
> 
>> On Feb 25, 2015, at 7:52 AM, Javier Mas Solé  
>> wrote:
>> 
>> I have a fresh install of openmpi-1.8.4 on a Mac with OSX-10.9.5. It 
>> compiled and installed fine. 
>> I have a Fortran code that runs perfectly on another similar machine with 
>> openmpi-1.6.5. It compiled
>> without error on the new Mac. When I try to mpirun it, it gives the 
>> message below.
>> 
>> Also, if I run echo $PATH I can spot /usr/local/bin, which was the subject 
>> of a warning in the installation instructions.
>> I have read in other forums that this might signal duplicate openmpi 
>> versions. I cannot rule this out, although
>> I don't find any duplicates in the /usr/local/bin folder.
>> 
>> I'm thinking of uninstalling this version and installing 1.6.5, which 
>> works fine. 
>> Can anyone tell me how to do this uninstall?
>> 
>> Thanks a lot
>> 
>> Javier
>> 
>> I have seen a similar post to this one 
>> 
>> fpmac114:AdSHW javier$ /usr/local/bin/mpirun sim1.exe
>> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
>> /usr/local/lib/openmpi/mca_ess_slurmd: 
>> dlopen(/usr/local/lib/openmpi/mca_ess_slurmd.so, 9): Symbol not found: 
>> _orte_jmap_t_class
>>  Referenced from: /usr/local/lib/openmpi/mca_ess_slurmd.so
>>  Expected in: flat namespace
>> in /usr/local/lib/openmpi/mca_ess_slurmd.so (ignored)
>> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
>> /usr/local/lib/openmpi/mca_errmgr_default: 
>> dlopen(/usr/local/lib/openmpi/mca_errmgr_default.so, 9): Symbol not found: 
>> _orte_errmgr_base_error_abort
>>  Referenced from: /usr/local/lib/openmpi/mca_errmgr_default.so
>>  Expected in: flat namespace
>> in /usr/local/lib/openmpi/mca_errmgr_default.so (ignored)
>> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
>> /usr/local/lib/openmpi/mca_routed_cm: 
>> dlopen(/usr/local/lib/openmpi/mca_routed_cm.so, 9): Symbol not found: 
>> _orte_message_event_t_class
>>  Referenced from: /usr/local/lib/openmpi/mca_routed_cm.so
>>  Expected in: flat namespace
>> in /usr/local/lib/openmpi/mca_routed_cm.so (ignored)
>> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
>> /usr/local/lib/openmpi/mca_routed_linear: 
>> dlopen(/usr/local/lib/openmpi/mca_routed_linear.so, 9): Symbol not found: 
>> _orte_message_event_t_class
>>  Referenced from: /usr/local/lib/openmpi/mca_routed_linear.so
>>  Expected in: flat namespace
>> in /usr/local/lib/openmpi/mca_routed_linear.so (ignored)
>> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
>> /usr/local/lib/openmpi/mca_grpcomm_basic: 
>> dlopen(/usr/local/lib/openmpi/mca_grpcomm_basic.so, 9): Symbol not found: 
>> _opal_profile
>>  Referenced from: /usr/local/lib/openmpi/mca_grpcomm_basic.so
>>  Expected in: flat namespace
>> in /usr/local/lib/openmpi/mca_grpcomm_basic.so (ignored)
>> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
>> /usr/local/lib/openmpi/mca_grpcomm_hier: 
>> dlopen(/usr/local/lib/openmpi/mca_grpcomm_hier.so, 9): Symbol not found: 
>> _orte_daemon_cmd_processor
>>  Referenced from: /usr/local/lib/openmpi/mca_grpcomm_hier.so
>>  Expected in: flat namespace
>> in /usr/local/lib/openmpi/mca_grpcomm_hier.so (ignored)
>> [fpmac114.inv.usc.es:00398] mca: base: component_find: unable to open 
>> /usr/local/lib/openmpi/mca_filem_rsh: 
>> dlopen(/usr/local/lib/openmpi/mca_filem_rsh.so, 9): Symbol not found: 
>> _opal_uses_threads
>>  Referenced from: /usr/local/lib/openmpi/mca_filem_rsh.so
>>  Expected in: flat namespace
>> in /usr/local/lib/openmpi/mca_filem_rsh.so (ignored)
>> [fpmac114:00398] *** Process received signal ***
>> [fpmac114:00398] Signal: Segmentation fault: 11 (11)
>> [fpmac114:00398] Signal code: Address not mapped (1)
>> [fpmac114:00398] Failing at address: 0x10013
>> [fpmac114:00398] [ 0] 0   libsystem_platform.dylib
>> 0x7fff933125aa _sigtramp + 26
>> [fpmac114:00398] [ 1] 0   ??? 
>> 0x7fff5b7f00ff 0x0 + 140734728438015
>> [fpmac114:00398] [ 2] 0   libopen-rte.7.dylib 
>> 0x000104469ee5 orte_rmaps_base_map_job +

Re: [OMPI users] MPIIO and OrangeFS

2015-02-25 Thread Edgar Gabriel

Two separate comments.

1. I do not know the precise status of the PVFS2 support in 1.8 series 
of Open MPI for ROMIO, I haven't tested it in a while. On master, I know 
that there is a compilation problem with PVFS2 and ROMIO on Open MPI and 
I am about to submit a report/question to ROMIO about that.


2. for OMPIO, we use PVFS2 as our main development platform. However, we 
have honestly not tried to use PVFS2 without the file system being 
mounted (i.e. we do rely on the kernel component to some extent).  Yes, 
internally we use the library interfaces of PVFS2, but we use the file 
system information to determine the type of the file system, and my 
guess is that if that information is not available, the pvfs2 fs (and 
fbtl for that matter) components disable themselves, and that's the 
error that you see. I can look into how to make that scenario work in 
OMPIO, but it's definitely not in the 1.8 series.


Thanks
Edgar

On 2/25/2015 2:01 AM, vithanousek wrote:

Thanks for your reply!

I checked my configuration parameters and it seems that everything is correct:
./configure --prefix=/opt/modules/openmpi-1.8.4 --with-sge --with-psm 
--with-pvfs2=/opt/orangefs 
--with-io-romio-flags='--with-file-system=pvfs2+ufs+nfs 
--with-pvfs2=/opt/orangefs'

I have added error checking code to my app, and I was getting multiple errors, 
such as MPI_ERR_AMODE, MPI_ERR_UNKNOWN, MPI_ERR_NO_SUCH_FILE, and MPI_ERR_IO 
(depending on the permissions of the pvfs2 mount point, and on --mca io 
romio/ompio --mca fs pvfs2).

But it seems that the error is in the source code of my application, because I 
can't find any more detailed documentation about using ROMIO and OMPIO.
I found at https://surfsara.nl/systems/lisa/software/pvfs2 that I should use 
"pvfs2:/pvfs_mount_point/name_of_file" as the filename instead of 
"/pvfs_mount_point/name_of_file". This works with ROMIO.

Do you know how to use OMPIO without mounting pvfs2? When I tried the same 
filename format as with ROMIO, I got "MPI_ERR_FILE: invalid file".
If I use the normal filename format ("/mountpoint/filename") and force the use 
of pvfs2 with --mca io ompio --mca fs pvfs2, then my app fails with
mca_fs_base_file_select() failed (and a backtrace).

The OrangeFS documentation (http://docs.orangefs.com/v_2_8_8/index.htm) has a 
chapter about using ROMIO, and it says that I should compile apps with -lpvfs2. 
I have tried it, but nothing changed (ROMIO works with the special filename 
format, OMPIO doesn't work).

Thanks for your help. If you can point me to some useful documentation, I will 
be happy.
Hanousek Vít


-- Original message --
From: Rob Latham
To: us...@open-mpi.org, vithanou...@seznam.cz
Date: 24. 2. 2015 22:10:08
Subject: Re: [OMPI users] MPIIO and OrangeFS

On 02/24/2015 02:00 PM, vithanousek wrote:

Hello,

I'm not sure if I have my OrangeFS (2.8.8) and OpenMPI (1.8.4) set up correctly. 
One short question:

Is it necessary to have OrangeFS mounted through the kernel module if I want to 
use MPIIO?


nope!


My simple MPIIO hello world program doesn't work if I haven't mounted OrangeFS. 
When I mount OrangeFS, it works. So I'm not sure if OMPIO (or ROMIO) is using 
the pvfs2 servers directly or if it is using the kernel module.

Sorry for the stupid question, but I didn't find any documentation about it.


http://www.pvfs.org/cvs/pvfs-2-8-branch-docs/doc/pvfs2-quickstart/pvfs2-quickstart.php#sec:romio

It sounds like you have not configured your MPI implementation with
PVFS2 support (OrangeFS is a re-branding of PVFS2, but as far as MPI-IO
is concerned, they are the same).

OpenMPI passes flags to romio like this at configure time:

   --with-io-romio-flags="--with-file-system=pvfs2+ufs+nfs"

I'm not sure how OMPIO takes flags.

If pvfs2-ping and pvfs2-cp and pvfs2-ls work, then you can bypass the
kernel.

also, please check return codes:

http://stackoverflow.com/questions/22859269/what-do-mpi-io-error-codes-mean/26373193#26373193

==rob



Thanks for replays
Hanousek Vít
___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/02/26382.php





--
Edgar Gabriel
Associate Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science  University of Houston
Philip G. Hoffman Hall, Room 524Houston, TX-77204, USA
Tel: +1 (713) 743-3857  Fax: +1 (713) 743-3335


Re: [OMPI users] MPIIO and OrangeFS

2015-02-25 Thread Rob Latham



On 02/25/2015 02:01 AM, vithanousek wrote:


Do you know how to use OMPIO without mounting pvfs2? When I tried the same 
filename format as with ROMIO, I got "MPI_ERR_FILE: invalid file".
If I use the normal filename format ("/mountpoint/filename") and force the use 
of pvfs2 with --mca io ompio --mca fs pvfs2, then my app fails with
mca_fs_base_file_select() failed (and a backtrace).


Sorry, I forgot to mention the importance (to ROMIO) of the file system 
prefix.  ROMIO can detect file systems two ways:

- either by using stat
- or by consulting a "file system prefix"

For PVFS2 or OrangeFS, prefixing the file name with 'pvfs2:' will tell 
ROMIO "treat this file like a PVFS2 file", and ROMIO will use the 
"system interface" to PVFS2/OrangeFS.


This file system prefix is described in the MPI standard and has proven 
useful in many situations.


Edgar has drawn the build error of OMPI-master to my attention.  I'll 
get that fixed straightaway.

==rob



The OrangeFS documentation (http://docs.orangefs.com/v_2_8_8/index.htm) has a 
chapter about using ROMIO, and it says that I should compile apps with -lpvfs2. 
I have tried it, but nothing changed (ROMIO works with the special filename 
format, OMPIO doesn't work).

Thanks for your help. If you can point me to some useful documentation, I will 
be happy.
Hanousek Vít


-- Original message --
From: Rob Latham
To: us...@open-mpi.org, vithanou...@seznam.cz
Date: 24. 2. 2015 22:10:08
Subject: Re: [OMPI users] MPIIO and OrangeFS

On 02/24/2015 02:00 PM, vithanousek wrote:

Hello,

I'm not sure if I have my OrangeFS (2.8.8) and OpenMPI (1.8.4) set up correctly. 
One short question:

Is it necessary to have OrangeFS mounted through the kernel module if I want to 
use MPIIO?


nope!


My simple MPIIO hello world program doesn't work if I haven't mounted OrangeFS. 
When I mount OrangeFS, it works. So I'm not sure if OMPIO (or ROMIO) is using 
the pvfs2 servers directly or if it is using the kernel module.

Sorry for the stupid question, but I didn't find any documentation about it.


http://www.pvfs.org/cvs/pvfs-2-8-branch-docs/doc/pvfs2-quickstart/pvfs2-quickstart.php#sec:romio

It sounds like you have not configured your MPI implementation with
PVFS2 support (OrangeFS is a re-branding of PVFS2, but as far as MPI-IO
is concerned, they are the same).

OpenMPI passes flags to romio like this at configure time:

   --with-io-romio-flags="--with-file-system=pvfs2+ufs+nfs"

I'm not sure how OMPIO takes flags.

If pvfs2-ping and pvfs2-cp and pvfs2-ls work, then you can bypass the
kernel.

also, please check return codes:

http://stackoverflow.com/questions/22859269/what-do-mpi-io-error-codes-mean/26373193#26373193

==rob



Thanks for replays
Hanousek Vít
___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/02/26382.php





--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA


Re: [OMPI users] machinefile binding error

2015-02-25 Thread Galloway, Jack D
--bind-to none worked, ran just fine. Additionally, --hetero-nodes also worked 
without error. However, --hetero-nodes didn't allow threading properly, while 
--bind-to none did.

Is adding that to every mpirun command line, or setting some system variables, 
the best option going forward? Or alternatively, would the following work to 
avoid command-line flags or environment variables?:

When you install OMPI, an "etc" directory gets created under the prefix 
location. In that directory is a file "openmpi-mca-params.conf". This is your 
default MCA param file that mpirun (and every OMPI process) reads on startup. 
You can put any params in there that you want. In this case, you'd add a line:

hwloc_base_binding_policy = none
from, http://www.open-mpi.org/community/lists/users/2014/05/24467.php

Thanks for the help,
--Jack


From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Tuesday, February 24, 2015 3:24 PM
To: Open MPI Users
Subject: Re: [OMPI users] machinefile binding error

It looks to me like some of the nodes don’t have the required numactl packages 
installed. Why don’t you try launching the job without binding, just to see if 
everything works?

Just add "--bind-to none" to your cmd line and see if things work


On Feb 24, 2015, at 2:21 PM, Galloway, Jack D <ja...@lanl.gov> wrote:

I think the error may be due to a new architecture change (brought on perhaps 
by the intel compilers?).  Bad wording here, but I’m really stumbling.  As I 
add processors to the mpirun hostname call, at ~100 processors I get the 
following error, which may be informative to more seasoned eyes.  Additionally, 
I’ve attached the config.log in case something stands out, grepping on 
“catastrophic error” gives not too many results, but I don’t know if the error 
may be there or more subtle.

--
WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.

  Node:  tebow124

This usually is due to not having the required NUMA support installed
on the node. In some Linux distributions, the required support is
contained in the libnumactl and libnumactl-devel packages.
This is a warning only; your job will continue, though performance may be 
degraded.
--
tebow
--
Open MPI tried to bind a new process, but something went wrong.  The
process was killed without launching the target application.  Your job
will now abort.

  Local host:tebow125
  Application name:  /bin/hostname
  Error message: hwloc_set_cpubind returned "Error" for bitmap "8,24"
  Location:  odls_default_module.c:551
--

Thanks,
--Jack




From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Galloway, Jack D
Sent: Tuesday, February 24, 2015 2:31 PM
To: Open MPI Users
Subject: Re: [OMPI users] machinefile binding error

Thank you sir, that fixed the first problem, hopefully the second is as easy!

I still get the second error when trying to farm out on a “large” number of 
processors:

machine file (“mach_burn_24s”):
tebow
tebow121 slots=24
tebow122 slots=24
tebow123 slots=24
tebow124 slots=24
tebow125 slots=24
tebow126 slots=24
tebow127 slots=24
tebow128 slots=24
tebow129 slots=24
tebow130 slots=24
tebow131 slots=24
tebow132 slots=24
tebow133 slots=24
tebow134 slots=24
tebow135 slots=24

mpirun -np 361 -machinefile mach_burn_24s hostname

--
WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.

  Node:  tebow124

This usually is due to not having the required NUMA support installed
on the node. In some Linux distributions, the required support is
contained in the libnumactl and libnumactl-devel packages.
This is a warning only; your job will continue, though performance may be 
degraded.
--
--
A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to: NONE
   Node:tebow125
   #processes:  2
   #cpus:   1

You can override this protection by adding the "overload-allowed"
option to your binding directive.
--

All the compute nodes (tebow121-135) have 24+ cores on them.

Any ideas?  Thanks!

--Jack


From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Tuesday, February 24, 2015 1:57 PM
To: Open MPI Users
Subject: Re:

Re: [OMPI users] machinefile binding error

2015-02-25 Thread Ralph Castain
Okay, it sounds like the problem is that some nodes have numactl installed, and 
thus can perform binding, and some don’t. It also sounds like you’d prefer to 
not bind your procs at all as they are multi-threaded, and you want to have as 
many procs on a node as you do slots. You clearly recognize that this means the 
threads from the different procs will be competing against each other for cpus 
in that design.

Correct?

If so, then indeed add that line to the default MCA param file and you’re good 
to go.

However, if you'd like to avoid thread competition for cpus, then another way 
you could do this is to specify the number of cpus to assign to each proc. In 
other words, you can bind the proc to more than one cpu, assuming you have 
enough cpus to meet your needs.

For example, let’s say you have 48 cores on your machines, and you want to run 
24 procs on each host. Then you could add this to the cmd line:

--map-by slot:pe=2

This will cause mpirun to assign one proc to each slot, but to bind that proc 
to two cores. The binding is done sequentially so as to avoid assigning more 
than one proc to a given core. If there aren’t enough cores to do what you ask, 
then you’ll get an error.

The threads for that proc will be confined to the assigned cores, and so the 
threads from a given process will only compete with themselves.
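
If you want to verify the placement you end up with, a small hybrid test like 
this (a sketch assuming Linux and OpenMP threads; sched_getcpu() is 
glibc-specific) will print the core every thread of every rank lands on:

  #define _GNU_SOURCE
  #include <mpi.h>
  #include <omp.h>
  #include <sched.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, provided;
      /* FUNNELED is enough here: only the main thread calls MPI */
      MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      #pragma omp parallel
      printf("rank %d thread %d on cpu %d\n",
             rank, omp_get_thread_num(), sched_getcpu());
      MPI_Finalize();
      return 0;
  }

Compile it with mpicc -fopenmp and launch it with each of the binding options 
above to see which cores the threads are actually confined to.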

HTH
Ralph



> On Feb 25, 2015, at 8:01 AM, Galloway, Jack D  wrote:
> 
> --bind-to none worked, ran just fine. Additionally, --hetero-nodes also worked 
> without error. However, --hetero-nodes didn't allow threading properly, while 
> --bind-to none did.  
>  
> Is this the best option forward, adding that on all mpirun command lines or 
> setting some system variables?  Or alternatively, would this work to avoid 
> command line specification or environment variables?:
> When you install OMPI, an "etc" directory gets created under the prefix 
> location. In that directory is a file "openmpi-mca-params.conf". This is your 
> default MCA param file that mpirun (and every OMPI process) reads on startup. 
> You can put any params in there that you want. In this case, you'd add a 
> line: 
> 
> hwloc_base_binding_policy = none 
> 
> from, http://www.open-mpi.org/community/lists/users/2014/05/24467.php 
> 
>  
> Thanks for the help,
> --Jack
>  
>  
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Tuesday, February 24, 2015 3:24 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] machinefile binding error
>  
> It looks to me like some of the nodes don’t have the required numactl 
> packages installed. Why don’t you try launching the job without binding, just 
> to see if everything works?
>  
> Just add "--bind-to none" to your cmd line and see if things work
>  
>  
> On Feb 24, 2015, at 2:21 PM, Galloway, Jack D wrote:
>  
> I think the error may be due to a new architecture change (brought on perhaps 
> by the intel compilers?).  Bad wording here, but I’m really stumbling.  As I 
> add processors to the mpirun hostname call, at ~100 processors I get the 
> following error, which may be informative to more seasoned eyes.  
> Additionally, I’ve attached the config.log in case something stands out, 
> grepping on “catastrophic error” gives not too many results, but I don’t know 
> if the error may be there or more subtle.
>  
> --
> WARNING: a request was made to bind a process. While the system
> supports binding the process itself, at least one node does NOT
> support binding memory to the process location.
>  
>   Node:  tebow124
>  
> This usually is due to not having the required NUMA support installed
> on the node. In some Linux distributions, the required support is
> contained in the libnumactl and libnumactl-devel packages.
> This is a warning only; your job will continue, though performance may be 
> degraded.
> --
> tebow
> --
> Open MPI tried to bind a new process, but something went wrong.  The
> process was killed without launching the target application.  Your job
> will now abort.
>  
>   Local host:tebow125
>   Application name:  /bin/hostname
>   Error message: hwloc_set_cpubind returned "Error" for bitmap "8,24"
>   Location:  odls_default_module.c:551
> --
>  
> Thanks,
> --Jack
>  
>  
>  
>  
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Galloway, Jack D
> Sent: Tuesday, February 24, 2015 2:31 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] machinefile binding error
>  
> Thank you sir, that fixed the first problem, hopefully the second is as easy!
>  
> I still get the secon

Re: [OMPI users] MPIIO and OrangeFS

2015-02-25 Thread vithanousek
Thanks very much for your reply, which cleared up all my problems.

For your information, I now have OpenMPI 1.8.4 with OrangeFS 2.8.8 support 
through both OMPIO and ROMIO.
Now I'm sure it works correctly, but I haven't done any deep testing.

I'm not sure how useful it is to use pvfs2 without mounting it, but it could be 
useful if OMPIO supported the same filesystem prefix as ROMIO, especially for 
compatibility between the modules.
The second thing is that the prefixes and the filename format should be 
documented somewhere (as mentioned in the standard); I didn't find it. I 
expected it to be in the MPI_File_open function documentation. 

But these things are only details. Once more, thanks for your reply.
Hanousek Vít

-- Original message --
From: Edgar Gabriel 
To: us...@open-mpi.org
Date: 25. 2. 2015 16:02:22
Subject: Re: [OMPI users] MPIIO and OrangeFS

Two separate comments.

1. I do not know the precise status of the PVFS2 support in 1.8 series 
of Open MPI for ROMIO, I haven't tested it in a while. On master, I know 
that there is a compilation problem with PVFS2 and ROMIO on Open MPI and 
I am about to submit a report/question to ROMIO about that.

2. for OMPIO, we use PVFS2 as our main development platform. However, we 
have honestly not tried to use PVFS2 without the file system being 
mounted (i.e. we do rely on the kernel component to some extent).  Yes, 
internally we use the library interfaces of PVFS2, but we use the file 
system information to determine the type of the file system, and my 
guess is that if that information is not available, the pvfs2 fs (and 
fbtl for that matter) components disable themselves, and that's the 
error that you see. I can look into how to make that scenario work in 
OMPIO, but it's definitely not in the 1.8 series.

Thanks
Edgar

On 2/25/2015 2:01 AM, vithanousek wrote:
> Thanks for your reply!
>
> I checked my configuration parameters and it seems that everything is correct:
> ./configure --prefix=/opt/modules/openmpi-1.8.4 --with-sge --with-psm 
> --with-pvfs2=/opt/orangefs 
> --with-io-romio-flags='--with-file-system=pvfs2+ufs+nfs 
> --with-pvfs2=/opt/orangefs'
>
> I have added error checking code to my app, and I was getting multiple 
> errors, such as MPI_ERR_AMODE, MPI_ERR_UNKNOWN, MPI_ERR_NO_SUCH_FILE, and 
> MPI_ERR_IO (depending on the permissions of the pvfs2 mount point, and on 
> --mca io romio/ompio --mca fs pvfs2)
>
> But it seems that the error is in the source code of my application, because 
> I can't find any more detailed documentation about using ROMIO and OMPIO.
> I found at https://surfsara.nl/systems/lisa/software/pvfs2 that I should 
> use "pvfs2:/pvfs_mount_point/name_of_file" as the filename instead of 
> "/pvfs_mount_point/name_of_file". This works with ROMIO.
>
> Do you know how to use OMPIO without mounting pvfs2? When I tried the same 
> filename format as with ROMIO, I got "MPI_ERR_FILE: invalid file".
> If I use the normal filename format ("/mountpoint/filename") and force the 
> use of pvfs2 with --mca io ompio --mca fs pvfs2, then my app fails with
> mca_fs_base_file_select() failed (and a backtrace).
>
> The OrangeFS documentation (http://docs.orangefs.com/v_2_8_8/index.htm) has 
> a chapter about using ROMIO, and it says that I should compile apps with 
> -lpvfs2. I have tried it, but nothing changed (ROMIO works with the special 
> filename format, OMPIO doesn't work)
>
> Thanks for your help. If you can point me to some useful documentation, I 
> will be happy.
> Hanousek Vít
>
>
> -- Original message --
> From: Rob Latham
> To: us...@open-mpi.org, vithanou...@seznam.cz
> Date: 24. 2. 2015 22:10:08
> Subject: Re: [OMPI users] MPIIO and OrangeFS
>
> On 02/24/2015 02:00 PM, vithanousek wrote:
>> Hello,
>>
>> I'm not sure if I have my OrangeFS (2.8.8) and OpenMPI (1.8.4) set up 
>> correctly. One short question:
>>
>> Is it necessary to have OrangeFS mounted through the kernel module if I want 
>> to use MPIIO?
>
> nope!
>
>> My simple MPIIO hello world program doesn't work if I haven't mounted 
>> OrangeFS. When I mount OrangeFS, it works. So I'm not sure if OMPIO (or 
>> ROMIO) is using the pvfs2 servers directly or if it is using the kernel module.
>>
>> Sorry for the stupid question, but I didn't find any documentation about it.
>
> http://www.pvfs.org/cvs/pvfs-2-8-branch-docs/doc/pvfs2-quickstart/pvfs2-quickstart.php#sec:romio
>
> It sounds like you have not configured your MPI implementation with
> PVFS2 support (OrangeFS is a re-branding of PVFS2, but as far as MPI-IO
> is concerned, they are the same).
>
> OpenMPI passes flags to romio like this at configure time:
>
>--with-io-romio-flags="--with-file-system=pvfs2+ufs+nfs"
>
> I'm not sure how OMPIO takes flags.
>
> If pvfs2-ping and pvfs2-cp and pvfs2-ls work, then you can bypass the
> kernel.
>
> also, please check return codes:
>
> http://stackoverflow.com/questions/22859269/what-do-mpi-io-error-codes-mean/26373193#26373193
>
> ==rob
>
>
>> Thanks for replays
>> Hanousek Vít
>> ___
>> 

Re: [OMPI users] MPIIO and OrangeFS

2015-02-25 Thread vithanousek
Thank you for your reply.

As a conclusion, I have summarized these things here for future users.

In OpenMPI 1.8.4, MPIIO is supported by both the ROMIO and OMPIO modules, and 
both modules support OrangeFS 2.8.8 (as PVFS2).
OpenMPI should be compiled with:
./configure --prefix=/opt/modules/openmpi-1.8.4 --with-sge --with-psm 
--with-pvfs2=/opt/orangefs 
--with-io-romio-flags='--with-file-system=pvfs2+ufs+nfs 
--with-pvfs2=/opt/orangefs'

When you are using the ROMIO module (--mca io romio), the PVFS2 filesystem 
doesn't need to be mounted, but you need to use the prefix "pvfs2:" in the 
filename (i.e. "pvfs2:/path_to_data/filename"). If PVFS2 is mounted, the prefix 
is not needed.

When you are using the OMPIO module (--mca io ompio), the PVFS2 filesystem must 
be mounted. OMPIO will still use PVFS2 directly; the mounted filesystem is only 
used to decide which filesystem the file is placed on. The prefix in the 
filename is not supported.
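
For future users, a minimal end-to-end version of the hello-world writer 
discussed in this thread might look like this (a C sketch; the path and file 
name are placeholders, and the "pvfs2:" prefix applies to the ROMIO case only):

  #include <mpi.h>
  #include <stdio.h>
  #include <string.h>

  int main(int argc, char **argv)
  {
      int rank, rc, len;
      char line[32], msg[MPI_MAX_ERROR_STRING];
      MPI_File fh;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      snprintf(line, sizeof(line), "hello from rank %4d\n", rank); /* fixed length */

      /* ROMIO: "pvfs2:" prefix; OMPIO: use the plain mounted path instead */
      rc = MPI_File_open(MPI_COMM_WORLD, "pvfs2:/pvfs_mount_point/hello.txt",
                         MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
      if (rc != MPI_SUCCESS) {
          MPI_Error_string(rc, msg, &len);
          fprintf(stderr, "MPI_File_open failed: %s\n", msg);
          MPI_Abort(MPI_COMM_WORLD, rc);
      }

      /* every rank writes its fixed-size line at its own offset */
      MPI_File_write_at(fh, (MPI_Offset)rank * (MPI_Offset)strlen(line),
                        line, (int)strlen(line), MPI_CHAR, MPI_STATUS_IGNORE);
      MPI_File_close(&fh);
      MPI_Finalize();
      return 0;
  }

Run it with --mca io romio or --mca io ompio as discussed above; with the error 
check on the open call, the MPI_ERR_* classes mentioned earlier in the thread 
are much easier to track down.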

Thanks
Hanousek Vít

-- Original message --
From: Rob Latham 
To: us...@open-mpi.org
Date: 25. 2. 2015 16:54:12
Subject: Re: [OMPI users] MPIIO and OrangeFS

On 02/25/2015 02:01 AM, vithanousek wrote:

> Do you know how to use OMPIO without mounting pvfs2? When I tried the same 
> filename format as with ROMIO, I got "MPI_ERR_FILE: invalid file".
> If I use the normal filename format ("/mountpoint/filename") and force the 
> use of pvfs2 with --mca io ompio --mca fs pvfs2, then my app fails with
> mca_fs_base_file_select() failed (and a backtrace).

Sorry, I forgot to mention the importance (to ROMIO) of the file system 
prefix.  ROMIO can detect file systems two ways:
- either by using stat
- or by consulting a "file system prefix"

For PVFS2 or OrangeFS, prefixing the file name with 'pvfs2:' will tell 
ROMIO "treat this file like a PVFS2 file", and ROMIO will use the 
"system interface" to PVFS2/OrangeFS.

This file system prefix is described in the MPI standard and has proven 
useful in many situations.

Edgar has drawn the build error of OMPI-master to my attention.  I'll 
get that fixed straightaway.
==rob

>
> The OrangeFS documentation (http://docs.orangefs.com/v_2_8_8/index.htm) has 
> a chapter about using ROMIO, and it says that I should compile apps with 
> -lpvfs2. I have tried it, but nothing changed (ROMIO works with the special 
> filename format, OMPIO doesn't work)
>
> Thanks for your help. If you can point me to some useful documentation, I 
> will be happy.
> Hanousek Vít
>
>
> -- Original message --
> From: Rob Latham
> To: us...@open-mpi.org, vithanou...@seznam.cz
> Date: 24. 2. 2015 22:10:08
> Subject: Re: [OMPI users] MPIIO and OrangeFS
>
> On 02/24/2015 02:00 PM, vithanousek wrote:
>> Hello,
>>
>> I'm not sure if I have my OrangeFS (2.8.8) and OpenMPI (1.8.4) set up 
>> correctly. One short question:
>>
>> Is it necessary to have OrangeFS mounted through the kernel module if I want 
>> to use MPIIO?
>
> nope!
>
>> My simple MPIIO hello world program doesn't work if I haven't mounted 
>> OrangeFS. When I mount OrangeFS, it works. So I'm not sure if OMPIO (or 
>> ROMIO) is using the pvfs2 servers directly or if it is using the kernel module.
>>
>> Sorry for the stupid question, but I didn't find any documentation about it.
>
> http://www.pvfs.org/cvs/pvfs-2-8-branch-docs/doc/pvfs2-quickstart/pvfs2-quickstart.php#sec:romio
>
> It sounds like you have not configured your MPI implementation with
> PVFS2 support (OrangeFS is a re-branding of PVFS2, but as far as MPI-IO
> is concerned, they are the same).
>
> OpenMPI passes flags to romio like this at configure time:
>
>--with-io-romio-flags="--with-file-system=pvfs2+ufs+nfs"
>
> I'm not sure how OMPIO takes flags.
>
> If pvfs2-ping and pvfs2-cp and pvfs2-ls work, then you can bypass the
> kernel.
>
> also, please check return codes:
>
> http://stackoverflow.com/questions/22859269/what-do-mpi-io-error-codes-mean/26373193#26373193
>
> ==rob
>
>
>> Thanks for replays
>> Hanousek Vít
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2015/02/26382.php
>>
>

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/02/26398.php