Hi Jack/Jinxu

Jack Bryan wrote:
Dear All,

I am working on a multi-computer Open MPI cluster system. If I put some data files in /home/mypath/folder, can all non-head nodes access the files in that folder?

Yes, it is possible, for instance if the /home/mypath/folder
directory is NFS-mounted on all nodes/computers.
Otherwise, if all disks and directories are local to each computer,
you need to copy the input files to the local disks before you
start, and copy the output files back to your login computer after the
program ends.

I need to load some data onto some nodes. If all nodes can access the data, I do not need to load it onto each node one by one. If multiple nodes access the same file to get data, is there a conflict?

To some extent.
The OS (on the computer where the file is located)
will arbitrate which process gets hold of the file at any given time.
If you have 1000 processes, this means a lot of arbitration,
and most likely contention.
Even with only two processes, if both are writing data to a single
file, nothing ensures that they write the output data in the order
that you want.

For example, node 1 calls fopen(myFile) at the same time that node 2 calls fopen(myFile). Is that allowed on an MPI cluster without conflict?

I think MPI won't have any control over this.
It is up to the operating system, and depends on
which process gets its "fopen" request to the OS first,
which is not a deterministic sequence of events.
Relying on that ordering is not a clean technique.

You could instead:

1) Assign a single process, say, rank 0,
to read and write data from/to the file(s).
Then use, say, MPI_Scatter[v] and MPI_Gather[v],
to distribute and collect the data back and forth
between that process (rank 0) and all other processes.

That is an old-fashioned but very robust technique.
It avoids any I/O conflict or contention among processes.
All the data flows across the processes via MPI.
The OS receives I/O requests from a single process (rank 0).

Besides MPI_Gather/MPI_Scatter, look also at MPI_Bcast,
if you need to send the same data to all processes,
assuming the data is being read by a single process.

2) Alternatively, you could use the MPI I/O functions,
if your files are binary.

I hope it helps,
Gus Correa

Any help is appreciated.
Jinxu Ding

July 12, 2010

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
