On Feb 24, 2014, at 7:55 AM, Saliya Ekanayake <esal...@gmail.com> wrote:

> This is very interesting. I've been working on getting one of our clustering 
> programs (http://grids.ucs.indiana.edu/ptliupages/publications/DAVS_IEEE.pdf) 
> to work with the Open MPI Java bindings, and we obtained very good speedup and 
> scalability when running on HPC clusters with InfiniBand. We are working on a 
> report with performance results and will make it available here soon.

Great! Will look forward to seeing it.

> 
> This is again interesting, as we have a series of MapReduce applications that 
> we have developed for analyzing gene sequences 
> (http://grids.ucs.indiana.edu/ptliupages/publications/DACIDR_camera_ready_v0.3.pdf), 
> which could benefit from having MPI support. Also, as you mentioned, we 
> run all these MapReduce jobs on HPC clusters.

The folks at TACC are doing the Intel beta on a mouse genome, and will also be 
publishing their results comparing Hadoop performance under YARN/HDFS vs 
Slurm/Lustre.

> 
> I am very eager to try (4) and wonder if you could kindly provide some 
> pointers on how to get it working.

The current release contains the initial "staged" execution support, but not 
the dynamic extension I described. To use staged execution, all you have to do 
is:

(a) express your mapper and reducer stages as separate app_contexts on the 
command line; and

(b) add --staged to the command line to request staged execution.

So it looks something like this:

mpirun --staged -n 10 ./mapper : -n 4 ./reducer

Depending on the allocation, mpirun will stage execution of the mappers and 
reducers, connecting the stdout of the first to the stdin of the second. There 
is also support for localized file systems (see the orte/mca/dfs framework) 
that allows you to transparently access/move data across the network, and of 
course mpirun supports pre-positioning of files via the --preload-files option.
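
To make that concrete, here is a rough sketch (in Java, purely as an 
illustration) of what an executable filling the "./reducer" slot above might 
look like. The class name, the tab-separated key/count record format, and the 
word-count task itself are assumptions made up for the example, not anything 
required by the staged execution support; the matching mapper would just be 
the mirror image, reading its input and writing "word<TAB>1" lines to stdout.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical reducer for the "-n 4 ./reducer" slot above: it reads the
// key/value lines the mappers wrote to stdout (delivered on stdin by the
// staged execution, per the description above) and aggregates a count per key.
public class WordCountReducer {
    public static void main(String[] args) throws Exception {
        BufferedReader in =
            new BufferedReader(new InputStreamReader(System.in));
        Map<String, Long> counts = new HashMap<String, Long>();
        String line;
        while ((line = in.readLine()) != null) {
            String[] kv = line.split("\t", 2);    // mapper emits "word\t1"
            if (kv.length != 2) {
                continue;                         // skip malformed lines
            }
            Long current = counts.get(kv[0]);
            long add = Long.parseLong(kv[1].trim());
            counts.put(kv[0], (current == null ? 0L : current) + add);
        }
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}

Since this class makes no MPI calls, it can be compiled with plain javac and 
substituted into the command line as, e.g., "java WordCountReducer" in place 
of "./reducer".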

HTH - feel free to ask questions and we'll be happy to help. Also, if you want 
to collaborate on the dynamic extension, we'd welcome the assist. Both Jeff and 
I have been somewhat swamped with other priorities and so progress on that last 
step is lagging.

Ralph

> 
> Thank you,
> Saliya
> 
> 
> 
> On Mon, Feb 24, 2014 at 10:30 AM, Ralph Castain <r...@open-mpi.org> wrote:
> 
> On Feb 23, 2014, at 10:42 AM, Saliya Ekanayake <esal...@gmail.com> wrote:
> 
>> Hi,
>> 
>> This is to get some info on the subject and is not directly a question about 
>> Open MPI.
>> 
>> I've seen Jeff's blog post on integrating Open MPI with Hadoop 
>> (http://blogs.cisco.com/performance/resurrecting-mpi-and-java/) and wanted 
>> to check whether it is related to the JIRA at 
>> https://issues.apache.org/jira/browse/MAPREDUCE-2911
> 
> Somewhat. A little history might help. I was asked a couple of years ago to 
> work on integrating MPI support with Hadoop. At that time, the thought of 
> those asking for my help was that we would enable YARN to support MPI, which 
> was captured in 2911. However, after working on it for a few months, it 
> became apparent to me that this was a mistake. YARN's architecture makes 
> support of MPI very difficult (but achievable - I did it with OMPI, and 
> someone else has now done it with MPICH), and the result exhibits horrible 
> scaling and relatively poor performance by HPC standards. So if you want to 
> run a very small MPI job under YARN, you can do it with a custom application 
> manager and JNI wrappers around every MPI call - just don't expect great 
> performance.
> 
> What I did instead was to change direction and focus on porting Hadoop to the 
> HPC environment. The thought here was that, if we could get the Hadoop classes 
> working in a regular HPC environment, then all of the HPC world's tools and 
> programming models would become available. This is what we have done, and it 
> comes in four parts:
> 
> 1. Java MPI bindings that are very close to C-level performance. These are 
> being released in the 1.7 series of OMPI and are unique to OMPI at this time. 
> Jose Roman and Oscar Vega continue to close the performance gap.
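
[For anyone who wants to try those bindings, a minimal program looks roughly 
like the sketch below. The class name is made up for illustration, and the 
method names follow the mpi.jar shipped with recent 1.7 releases, so check 
your installed bindings if they differ.

import mpi.MPI;
import mpi.MPIException;

// Minimal MPI "hello" using the Open MPI Java bindings (a sketch).
public class JavaMpiHello {
    public static void main(String[] args) throws MPIException {
        MPI.Init(args);                        // start the MPI runtime
        int rank = MPI.COMM_WORLD.getRank();   // this process's rank
        int size = MPI.COMM_WORLD.getSize();   // number of procs in comm_world
        System.out.println("Hello from rank " + rank + " of " + size);
        MPI.Finalize();                        // shut down cleanly
    }
}

Compile with the mpijavac wrapper that ships with the bindings and launch with 
something like "mpirun -n 4 java JavaMpiHello".]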
> 
> 2. Integration with HPC resource managers such as Slurm and Moab. Intel has 
> taken the lead there and announced that support at SC13; it is in beta test now.
> 
> 3. Integration with HPC file systems such as Lustre. Intel again took the lead 
> here and has a Lustre adaptor in beta test.
> 
> 4. Equivalent of an application manager to stage map-reduce executions. I 
> updated OMPI's "mpirun" to handle that - available in the current 1.7 release 
> series. It fully understands "staged" execution and also notifies the 
> associated processes when MPI is feasible (i.e., all the procs in comm_world 
> are running).
> 
> We continue to improve the Hadoop support - Cisco and I are collaborating on 
> a new "dynamic MPI" capability that will allow the procs to interact without 
> imposing the barrier at MPI_Init, for example. So I expect that this summer 
> will demonstrate a pretty robust capability in that area.
> 
> After all, there is no reason you shouldn't be able to run Hadoop on an HPC 
> cluster :-)
> 
> HTH
> Ralph
> 
>> 
>> Also, is there a place I can get more info on this effort?
>> 
>> Thank you,
>> Saliya
>> 
>> -- 
>> Saliya Ekanayake esal...@gmail.com 
>> Cell 812-391-4914 Home 812-961-6383
>> http://saliya.org
> 
> 
> 
> 
> 
> -- 
> Saliya Ekanayake esal...@gmail.com 
> Cell 812-391-4914 Home 812-961-6383
> http://saliya.org
