Srinivas,
There's also Kernel-Level Checkpointing vs. User-Level Checkpointing -
if you can checkpoint an MPI task and restart it on a new node, then
this is also "process migration".
Of course, doing a checkpoint & restart can be slower than pure
in-kernel process migration, but the advantage is
> Thanks and regards
> Durga
>
>
> On Thu, Aug 25, 2011 at 11:08 AM, Rayson Ho wrote:
>> Srinivas,
>>
>> There's also Kernel-Level Checkpointing vs. User-Level Checkpointing -
>> if you can checkpoint an MPI task and restart it on a new node, then
>> th
On Sat, Aug 27, 2011 at 9:12 AM, Ralph Castain wrote:
> OMPI has no way of knowing that you will turn the node on at some future
> point. All it can do is try to launch the job on the provided node, which
> fails because the node doesn't respond.
> You'll have to come up with some scheme for telli
Hi Xin,
Since it is not Open MPI specific, you might want to try to work with
the SciNet guys first. The "SciNet Research Computing Consulting
Clinic" is specifically formed to help U of T students & researchers
develop and design compute-intensive programs.
http://www.scinet.utoronto.ca/
http://
Did you notice the error message:
/usr/bin/install: cannot remove
`/opt/openmpi/share/openmpi/amca-param-sets/example.conf': Permission
denied
I would check the permission settings of the file first if I encountered
something like this...
Rayson
=
Grid Engine / O
You can use a debugger (just gdb will do, no TotalView needed) to find
out which MPI send & receive calls are hanging the code on the
distributed cluster, and see whether the hanging send & receive pair is
caused by a problem described at:
Deadlock avoidance in your MPI programs:
http://www.cs.ucsb.edu/~hnielsen/
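For reference, here is a minimal sketch (not code from this thread) of the
classic pattern behind many of these hangs: both ranks call a blocking
MPI_Send before posting the matching MPI_Recv. For messages too large for the
eager protocol, both ranks block inside MPI_Send, and attaching gdb to the
stuck processes shows both stacks sitting in the send call.

/* Sketch only: two ranks exchanging a large buffer. The commented-out
 * Send-before-Recv ordering deadlocks; MPI_Sendrecv is one safe fix. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size, peer;
    const int N = 1 << 20;            /* large message: rendezvous, not eager */
    double *sendbuf, *recvbuf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size != 2) {
        if (rank == 0) fprintf(stderr, "run with exactly 2 ranks\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    peer = 1 - rank;

    sendbuf = calloc(N, sizeof(double));
    recvbuf = calloc(N, sizeof(double));

    /* Deadlock-prone ordering (don't do this):
     *   MPI_Send(sendbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
     *   MPI_Recv(recvbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
     * Both ranks wait in MPI_Send for the other side to post a receive. */

    /* Safe alternative: let the library progress the send and receive together. */
    MPI_Sendrecv(sendbuf, N, MPI_DOUBLE, peer, 0,
                 recvbuf, N, MPI_DOUBLE, peer, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    if (rank == 0) printf("exchange completed without deadlock\n");
    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}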
We are using hwloc-1.2.2 for topology binding in Open Grid
Scheduler/Grid Engine 2011.11, and a user is encountering similar
issues:
http://gridengine.org/pipermail/users/2011-December/002126.html
In Open MPI, there is the configure switch "--without-libnuma" to turn
libnuma off. But since Open M
On Sat, Dec 10, 2011 at 3:21 PM, amjad ali wrote:
> (2) The latest MPI implementations are intelligent enough that they use some
> efficient mechanism while executing MPI based codes on shared memory
> (multicore) machines. (please tell me any reference to quote this fact).
Not an academic paper
> the compute nodes have to be
> explicitly DMA'd in? Is there a middleware layer that makes it
> transparent to the upper layer software?
>
> Best regards
> Durga
>
> On Mon, Dec 12, 2011 at 11:00 AM, Rayson Ho wrote:
>> On Sat, Dec 10, 2011 at 3:21 PM, amjad ali wrote:
On Tue, Jan 10, 2012 at 10:02 AM, Roberto Rey wrote:
> I'm running some tests on EC2 cluster instances with 10 Gigabit Ethernet
> hardware and I'm getting strange latency results with Netpipe and OpenMPI.
- There are 3 types of instances that can use 10 GbE. Are you using
"cc1.4xlarge", "cc2.8xla
On Mon, Jan 30, 2012 at 11:33 PM, Tom Bryan wrote:
> For our use, yes, spawn_multiple makes sense. We won't be spawning lots and
> lots of jobs in quick succession. We're using MPI as a robust way to get
> IPC as we spawn multiple child processes while using SGE to help us with
> load balancing
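For reference, a minimal sketch of what a spawn_multiple call looks like -
this is not Tom's code, and "worker_a" / "worker_b" are hypothetical
executable names:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Comm children;
    char *cmds[2]     = { "worker_a", "worker_b" };   /* hypothetical binaries */
    int maxprocs[2]   = { 2, 2 };                     /* two copies of each */
    MPI_Info infos[2] = { MPI_INFO_NULL, MPI_INFO_NULL };
    int errcodes[4];                                  /* sum of maxprocs */

    MPI_Init(&argc, &argv);

    /* One call launches both sets of children; parent and children end up
     * connected through the "children" intercommunicator. */
    MPI_Comm_spawn_multiple(2, cmds, MPI_ARGVS_NULL, maxprocs, infos,
                            0, MPI_COMM_WORLD, &children, errcodes);

    /* ... exchange messages with the children over "children" here ... */

    MPI_Comm_disconnect(&children);
    MPI_Finalize();
    return 0;
}

The children see the parent through MPI_Comm_get_parent(), so the
intercommunicator gives you the IPC channel without managing sockets
yourself.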
Brock,
I listened to the podcast on Saturday, and I just downloaded it again
10 mins ago.
Did the interview really end at 26:34?? And if I recall correctly, you
& Jeff did not get a chance to ask them the "which source control
system do you guys use" question :-D
Rayson
> iTunes updates.
>
>
>
> On Feb 20, 2012, at 3:25 PM, Rayson Ho wrote:
>
>> Brock,
>>
>> I listened to the podcast on Saturday, and I just downloaded it again
>> 10 mins ago.
>>
>> Did the interview really end at 26:34?? And if I recall corr
>> it's both longer than 33 mins (i.e., it keeps playing after the timer
>> reaches 0:00), and then it cuts off in the middle of one of Rajeev's
>> answers. Doh. :-(
>>
>> Brock is checking into it…
>>
>>
>> On Feb 20, 2012, at 4:37 PM, Rayson Ho wrote:
>>
>>> H
>> it's both longer than 33 mins (i.e., it keeps playing after the timer
>> reaches 0:00), and then it cuts off in the middle of one of Rajeev's
>> answers. Doh. :-(
>>
>> Brock is checking into it…
>>
>>
>> On Feb 20, 2012, at 4:37 PM, Rayson Ho w
On Mon, Feb 20, 2012 at 6:02 PM, Jeffrey Squyres wrote:
>> (But what's happened to the "what source control system do you guys
>> use" question usually asked by Jeff? :-D )
>
>
> I need to get back to asking that one. :-)
Skynet needs to send Jeff (and Arnold) back in time!
> It's just a perso
On Tue, Feb 21, 2012 at 12:06 PM, Rob Latham wrote:
> ROMIO's testing and performance regression framework is honestly a
> shambles. Part of that is a challenge with the MPI-IO interface
> itself. For MPI messaging you exercise the API and you have pretty
> much covered everything. MPI-IO, thou
Hi Joshua,
I don't think the new built-in rsh in later versions of Grid Engine is
going to make any difference - the orted is the real starter of the
MPI tasks and should have a greater influence on the task environment.
However, it would help if you can record the nice values and resource
limits
On Sun, Apr 1, 2012 at 11:27 PM, Rohan Deshpande wrote:
> error while loading shared libraries: libmpi.so.0: cannot open shared
> object file no such object file: No such file or directory.
Were you trying to run the MPI program on a remote machine?? If you
are, then make sure that each machine
On Tue, Apr 17, 2012 at 2:26 AM, jody wrote:
> As to OpenMP: i already make use of OpenMP in some places (for
> instance for the creation of the large data block),
> but unfortunately my main application is not well suited for OpenMP
> parallelization..
If MPI does not support this kind of progra
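If it helps, here is a minimal hybrid MPI + OpenMP sketch (not jody's
application, just the usual split): MPI ranks across nodes, OpenMP threads
inside each rank, requested via MPI_Init_thread with MPI_THREAD_FUNNELED so
that only the main thread makes MPI calls. Compile with something like
"mpicc -fopenmp".

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank;
    double local_sum = 0.0, global_sum = 0.0;

    /* Ask for FUNNELED support: OpenMP threads exist, but only the thread
     * that called MPI_Init_thread touches MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (provided < MPI_THREAD_FUNNELED && rank == 0)
        fprintf(stderr, "warning: MPI library lacks MPI_THREAD_FUNNELED\n");

    /* OpenMP covers the shared-memory part, e.g. building/reducing a large block. */
    #pragma omp parallel for reduction(+:local_sum)
    for (int i = 0; i < 1000000; i++)
        local_sum += (double)i;

    /* MPI covers the distributed-memory part, called outside the parallel region. */
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %f, threads per rank = %d\n",
               global_sum, omp_get_max_threads());

    MPI_Finalize();
    return 0;
}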
Is StarCluster too complex for your use case?
http://web.mit.edu/star/cluster/
Rayson
=
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/
Scalable Grid Engine Support Program
http://www.scalablelogic.com/
On Mon, Apr 23, 2012 at 6:20 PM,
Seems like there's a bug in the application. Did you or someone else
write it, or did you get it from an ISV??
You can log onto one of the nodes, attach a debugger, and see if the
MPI task is waiting for a message (looping in one of the MPI receive
functions)...
Rayson
==
And before you try to understand the OMPI code, read some of the
papers & presentations first:
http://www.open-mpi.org/papers/
Rayson
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/
Scalable Grid Engine Support Program
http://www.scalable
We posted an MPI quiz but so far no one on the Grid Engine list has
the answer that Jeff was expecting:
http://blogs.scalablelogic.com/
Others have offered interesting points, and I just want to see if
people on the Open MPI list have the *exact* answer and the first one
gets a full Cisco Live C
Hi Bill,
If you *really* have time, then you can go deep into the log, and find
out why configure failed. It looks like configure failed when it tried
to compile this code:
.text
# .gsym_test_func
.globl .gsym_test_func
.gsym_test_func:
# .gsym_test_func
configure:26752: result: none
conf
I originally thought that it was an issue related to 32-bit
executables, but it seems to affect 64-bit as well...
I found references to this problem -- it was reported back in 2007:
http://lists.mcs.anl.gov/pipermail/mpich-discuss/2007-July/002600.html
If you look at the code, you will find tha
Hi Christian,
The code you posted is very similar to another school assignment sent
to this list 2 years ago:
http://www.open-mpi.org/community/lists/users/2010/10/14619.php
At that time, the code was written in Fortran, and now it is written
in C - however, the variable names, logic, etc. are qu
Hi Eric,
Sounds like it's also related to this problem reported by Scinet back in July:
http://www.open-mpi.org/community/lists/users/2012/07/19762.php
And I think I found the issue, but I still have not followed up with
the ROMIO guys yet. And I was not sure if Scinet was waiting for the
fix or
Mathieu,
Can you include the small C program you wrote??
Rayson
==
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/
On Mon, Oct 29, 2012 at 12:08 PM, Damien wrote:
> Mathieu,
>
> Where is the crash
If you read the log, you will find:
./configure: line 5373: icc: command not found
configure:5382: $? = 127
configure:5371: icc -v >&5
Rayson
==
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge
On Thu, Nov 8, 2012 at 11:07 AM, Jeff Squyres wrote:
> Correct. PLPA was a first attempt at a generic processor affinity solution.
> hwloc is a 2nd generation, much Much MUCH better solution than PLPA (we
> wholly killed PLPA
> after the INRIA guys designed hwloc).
Edwin,
We ported OGS/Grid
In your shell, run:
export PATH=$PATH
And then rerun configure with the original parameters - it
should find icc & ifort this time.
Rayson
==
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.n
, I use only ifort.
> Now I have a folder with OPT. If it works now and it is OK to use only ifort, what
> can I do to learn?
> I mean, where can I find a good tutorial or "hello world" project in Fortran? I have
> found something for C but nothing for Fortran.
>
> Thanks again
>
> Diego
>
- give us the list of cores available
> to us so we can map and do affinity, and pass in your own mapping. Maybe
> with some logic so we can decide which to use based on whether OMPI or GE
> did the mapping??
>
> Not sure here - just thinking out loud.
> Ralph
>
> On Sep 30
Rayson
On Thu, Oct 22, 2009 at 10:16 AM, Ralph Castain wrote:
> Hi Rayson
>
> You're probably aware: starting with 1.3.4, OMPI will detect and abide by
> external bindings. So if grid engine sets a binding, we'll follow it.
>
> Ralph
>
> On Oct 22, 2009, at
If you are using instance types that support SR-IOV (a.k.a. "enhanced
networking" in AWS), then turn it on. We saw huge differences with SR-IOV
enabled:
http://blogs.scalablelogic.com/2013/12/enhanced-networking-in-aws-cloud.html
http://blogs.scalablelogic.com/2014/01/enhanced-networking-in-aws-cl
On Sun, Mar 20, 2016 at 10:37 PM, dpchoudh . wrote:
> I'd tend to agree with Gilles. I have written CUDA programs in pure C
> (i.e. neither involving MPI nor C++) and a pure C based tool chain builds
> the code successfully. So I don't see why CUDA should be intrinsically C++.
>
nvcc calls the C