Hi,

On 09.08.2011, at 08:46, Christopher Jones wrote:

> I changed the subject of my previous posting to reflect a new problem 
> encountered when I changed my strategy to using SSH instead of Xgrid on two 
> Mac Pros. I've set up password-less SSH between the two Macs (connected via 
> direct Ethernet, both running Open MPI 1.2.8 on OS X 10.6.8) per the 
> instructions in the FAQ. I can type 'ssh computer-name.local' on either 
> computer and connect without a password prompt. From what I can see, the 
> ssh-agent is up and running - the following is listed in my ENV:
> 
> SSH_AUTH_SOCK=/tmp/launch-5FoCc1/Listeners
> SSH_AGENT_PID=61058
> 
> My host file simply lists 'localhost' and 
> 'chrisjones2@allana-welshs-mac-pro.local'. When I run a simple hello_world 
> test, I get what seems like a reasonable output:
> 
> chris-joness-mac-pro:~ chrisjones$ mpirun -np 8 -hostfile hostfile 
> ./test_hello
> Hello world from process 0 of 8
> Hello world from process 1 of 8
> Hello world from process 2 of 8
> Hello world from process 3 of 8
> Hello world from process 4 of 8
> Hello world from process 7 of 8
> Hello world from process 5 of 8
> Hello world from process 6 of 8
> 
> I can also run hostname and get what seems to be an ok response (unless I'm 
> wrong about this):
> 
> chris-joness-mac-pro:~ chrisjones$ mpirun -np 8 -hostfile hostfile hostname
> allana-welshs-mac-pro.local
> allana-welshs-mac-pro.local
> allana-welshs-mac-pro.local
> allana-welshs-mac-pro.local
> quadcore.mikrob.slu.se
> quadcore.mikrob.slu.se
> quadcore.mikrob.slu.se
> quadcore.mikrob.slu.se
> 
> 
> However, when I run the ring_c test, it freezes:
> 
> chris-joness-mac-pro:~ chrisjones$ mpirun -np 8 -hostfile hostfile ./ring_c
> Process 0 sending 10 to 1, tag 201 (8 processes in ring)
> Process 0 sent to 1
> Process 0 decremented value: 9
> 
> (I noted that the processors on both computers were active).
> 
> ring_c was compiled separately on each computer; however, both have the same 
> version of Open MPI and OS X. I've gone through the FAQ and searched the user 
> forum, but I can't quite seem to get this problem unstuck.
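For reference, ring_c does roughly the following (a condensed sketch, not the exact Open MPI example source): rank 0 injects a token of 10, every rank forwards it around the ring, and rank 0 decrements it on each lap until it reaches 0. With 4 ranks on each machine the token has to cross the Ethernet link on every lap, so a stall right after the first decrement suggests traffic between the two hosts is being blocked or misrouted, rather than a bug in the program itself.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, next, prev, message;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    next = (rank + 1) % size;
    prev = (rank + size - 1) % size;

    if (rank == 0) {
        message = 10;
        printf("Process 0 sending %d to %d, tag 201 (%d processes in ring)\n",
               message, next, size);
        MPI_Send(&message, 1, MPI_INT, next, 201, MPI_COMM_WORLD);
        printf("Process 0 sent to %d\n", next);
    }

    /* everyone keeps forwarding the token until it has been counted down to 0 */
    while (1) {
        MPI_Recv(&message, 1, MPI_INT, prev, 201, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        if (rank == 0) {
            message--;
            printf("Process 0 decremented value: %d\n", message);
        }
        MPI_Send(&message, 1, MPI_INT, next, 201, MPI_COMM_WORLD);
        if (message == 0)
            break;
    }

    /* rank 0 absorbs the final copy of the token still circling the ring */
    if (rank == 0)
        MPI_Recv(&message, 1, MPI_INT, prev, 201, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}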

Do you have any firewall running on the machines?
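If not, another thing worth trying (a sketch; en0 is only a guess for the direct Ethernet link, so check ifconfig on both Macs for the real interface name) is to restrict Open MPI's TCP traffic to that single interface:

mpirun --mca btl_tcp_if_include en0 -np 8 -hostfile hostfile ./ring_c

That rules out the case where the TCP BTL picks an interface (e.g. the one facing your normal network) that cannot actually reach the other machine.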

-- Reuti

> 
> Many thanks for your time,
> Chris
> 
> On Aug 5, 2011, at 6:00 PM, <users-requ...@open-mpi.org> wrote:
> 
>> Send users mailing list submissions to
>>       us...@open-mpi.org
>> 
>> To subscribe or unsubscribe via the World Wide Web, visit
>>       http://www.open-mpi.org/mailman/listinfo.cgi/users
>> or, via email, send a message with subject or body 'help' to
>>       users-requ...@open-mpi.org
>> 
>> You can reach the person managing the list at
>>       users-ow...@open-mpi.org
>> 
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of users digest..."
>> 
>> 
>> Today's Topics:
>> 
>>  1. Re: OpenMPI causing WRF to crash (Jeff Squyres)
>>  2. Re: OpenMPI causing WRF to crash (Anthony Chan)
>>  3. Re: Program hangs on send when run with nodes on  remote
>>     machine (Jeff Squyres)
>>  4. Re: openmpi 1.2.8 on Xgrid noob issue (Jeff Squyres)
>>  5. Re: parallel I/O on 64-bit indexed arays (Rob Latham)
>> 
>> 
>> ----------------------------------------------------------------------
>> 
>> Message: 1
>> Date: Thu, 4 Aug 2011 19:18:36 -0400
>> From: Jeff Squyres <jsquy...@cisco.com>
>> Subject: Re: [OMPI users] OpenMPI causing WRF to crash
>> To: Open MPI Users <us...@open-mpi.org>
>> Message-ID: <3f0e661f-a74f-4e51-86c0-1f84feb07...@cisco.com>
>> Content-Type: text/plain; charset=windows-1252
>> 
>> Signal 15 is usually SIGTERM on Linux, meaning that some external entity 
>> probably killed the job.
>> 
>> The OMPI error message you describe is also typical for that kind of 
>> scenario -- i.e., a process exiting without calling MPI_Finalize could mean 
>> that it called exit() itself or that some external process killed it.
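As a minimal illustration (hypothetical code, unrelated to WRF itself), a rank that bails out with exit() before MPI_Finalize is enough to produce that kind of report from mpirun:

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 1) {
        /* simulates an application-level abort: MPI_Finalize is never called */
        exit(1);
    }
    MPI_Finalize();
    return 0;
}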
>> 
>> 
>> On Aug 3, 2011, at 7:24 AM, BasitAli Khan wrote:
>> 
>>> I am trying to run a rather heavy WRF simulation with spectral nudging, but 
>>> the simulation crashes after 1.8 minutes of integration. The simulation has 
>>> two domains, with d01 = 601x601 and d02 = 721x721, and 51 vertical levels. 
>>> I tried this simulation on two different systems, but the result was more 
>>> or less the same. For example:
>>> 
>>> On our Blue Gene/P with SUSE Linux Enterprise Server 10 (ppc) and the XLF 
>>> compiler, I tried to run WRF on 2048 shared-memory nodes (1 compute node = 
>>> 4 cores, 32-bit, 850 MHz). For the parallel run I used mpixlc, mpixlcxx and 
>>> mpixlf90. I got the following error message in the wrf.err file:
>>> 
>>> <Aug 01 19:50:21.244540> BE_MPI (ERROR): The error message in the job
>>> record is as follows:
>>> <Aug 01 19:50:21.244657> BE_MPI (ERROR):   "killed with signal 15"
>>> 
>>> I also tried to run the same simulation on our Linux cluster (Red Hat 
>>> Enterprise Linux 5.4, x86_64, Intel compiler) with 8, 16 and 64 nodes (1 
>>> compute node = 8 cores). For the parallel run I used 
>>> mpi/openmpi/1.4.2-intel-11. I got the following error message in the error 
>>> log after a couple of minutes of integration:
>>> 
>>> "mpirun has exited due to process rank 45 with PID 19540 on
>>> node ci118 exiting without calling "finalize". This may
>>> have caused other processes in the application to be
>>> terminated by signals sent by mpirun (as reported here)."
>>> 
>>> I tried many things, but nothing seems to be working. However, if I reduce 
>>> the grid points below 200, the simulation runs fine. It appears that Open 
>>> MPI probably has a problem with a large number of grid points, but I have 
>>> no idea how to fix it. I would greatly appreciate it if you could suggest a 
>>> solution.
>>> 
>>> Best regards,
>>> ---
>>> Basit A. Khan, Ph.D.
>>> Postdoctoral Fellow
>>> Division of Physical Sciences & Engineering
>>> Office# 3204, Level 3, Building 1,
>>> King Abdullah University of Science & Technology
>>> 4700 King Abdullah Blvd, Box 2753, Thuwal 23955-6900,
>>> Kingdom of Saudi Arabia.
>>> 
>>> Office: +966(0)2 808 0276,  Mobile: +966(0)5 9538 7592
>>> E-mail: basitali.k...@kaust.edu.sa
>>> Skype name: basit.a.khan
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 2
>> Date: Thu, 4 Aug 2011 18:59:59 -0500 (CDT)
>> From: Anthony Chan <c...@mcs.anl.gov>
>> Subject: Re: [OMPI users] OpenMPI causing WRF to crash
>> To: Open MPI Users <us...@open-mpi.org>
>> Message-ID:
>>       <660521091.191111.1312502399225.javamail.r...@zimbra.anl.gov>
>> Content-Type: text/plain; charset=utf-8
>> 
>> 
>> If you want to debug this on BGP, you could set BG_COREDUMPONERROR=1
>> and look at the backtrace in the light weight core files
>> (you probably need to recompile everything with -g).
>> 
>> A.Chan
>> 
>> ----- Original Message -----
>>> Hi Dmitry,
>>> Thanks for a prompt and fairly detailed response. I have also forwarded 
>>> the email to the WRF community in the hope that somebody has a 
>>> straightforward solution. I will try to debug the error as you suggested 
>>> if I don't have much luck with the WRF forum.
>>> 
>>> Cheers,
>>> ---
>>> 
>>> Basit A. Khan, Ph.D.
>>> Postdoctoral Fellow
>>> Division of Physical Sciences & Engineering
>>> Office# 3204, Level 3, Building 1,
>>> King Abdullah University of Science & Technology
>>> 4700 King Abdullah Blvd, Box 2753, Thuwal 23955-6900,
>>> Kingdom of Saudi Arabia.
>>> 
>>> Office: +966(0)2 808 0276, Mobile: +966(0)5 9538 7592
>>> E-mail: basitali.k...@kaust.edu.sa
>>> Skype name: basit.a.khan
>>> 
>>> 
>>> 
>>> 
>>> On 8/3/11 2:46 PM, "Dmitry N. Mikushin" <maemar...@gmail.com> wrote:
>>> 
>>>> 5 apparently means one of the WRF's MPI processes has been
>>>> unexpectedly terminated, maybe by program decision. No matter, if it
>>>> is OpenMPI-specifi
>>> 
>>> 
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 3
>> Date: Thu, 4 Aug 2011 20:46:16 -0400
>> From: Jeff Squyres <jsquy...@cisco.com>
>> Subject: Re: [OMPI users] Program hangs on send when run with nodes on
>>       remote machine
>> To: Open MPI Users <us...@open-mpi.org>
>> Message-ID: <f344f301-ad7b-4e83-b0df-a6e001072...@cisco.com>
>> Content-Type: text/plain; charset=us-ascii
>> 
>> I notice that in the worker, you have:
>> 
>> eth2      Link encap:Ethernet  HWaddr 00:1b:21:77:c5:d4
>>         inet addr:192.168.1.155  Bcast:192.168.1.255  Mask:255.255.255.0
>>         inet6 addr: fe80::21b:21ff:fe77:c5d4/64 Scope:Link
>>         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>         RX packets:9225846 errors:0 dropped:75175 overruns:0 frame:0
>>         TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
>>         collisions:0 txqueuelen:1000
>>         RX bytes:1336628768 (1.3 GB)  TX bytes:552 (552.0 B)
>> 
>> eth3      Link encap:Ethernet  HWaddr 00:1b:21:77:c5:d5
>>         inet addr:192.168.1.156  Bcast:192.168.1.255  Mask:255.255.255.0
>>         inet6 addr: fe80::21b:21ff:fe77:c5d5/64 Scope:Link
>>         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>         RX packets:26481809 errors:0 dropped:75059 overruns:0 frame:0
>>         TX packets:18030236 errors:0 dropped:0 overruns:0 carrier:0
>>         collisions:0 txqueuelen:1000
>>         RX bytes:70061260271 (70.0 GB)  TX bytes:11844181778 (11.8 GB)
>> 
>> Two different NICs are on the same subnet -- that doesn't seem like a good 
>> idea...?  I think this topic has come up on the users list before, and, 
>> IIRC, the general consensus is "don't do that", because it's not clear 
>> which NIC Linux will actually use for outgoing traffic bound for the 
>> 192.168.1.x subnet.
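One way to take that ambiguity out of the picture (a sketch; eth3 here is just whichever NIC you actually want to use, per the ifconfig output above) is to tell Open MPI's TCP BTL explicitly which interface to use:

./mpirun --mca btl_tcp_if_include eth3 -np 2 -host localhost,remotehost ./mpi-test

or, the other way around, exclude the unwanted one (the loopback normally stays excluded as well):

./mpirun --mca btl_tcp_if_exclude eth2,lo -np 2 -host localhost,remotehost ./mpi-test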
>> 
>> 
>> 
>> On Aug 4, 2011, at 1:59 PM, Keith Manville wrote:
>> 
>>> I am having trouble running my MPI program on multiple nodes. I can
>>> run multiple processes on a single node, and I can spawn processes on
>>> remote nodes, but when I call Send from a remote node, the call never
>>> returns, even though there is a matching Recv waiting. I'm pretty sure
>>> this is an issue with my configuration, not my code. I've tried some
>>> other sample programs I found and had the same problem of hanging on a
>>> send from one host to another.
>>> 
>>> Here's an in-depth description:
>>> 
>>> I wrote a quick test program where each process with rank > 0 sends an
>>> int to the master (rank 0), and the master receives until it gets
>>> something from every other process.
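A minimal version of that pattern (a sketch reconstructed from this description, not the attached mpi-test source) would be:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        int i, val;
        MPI_Status st;
        for (i = 1; i < size; i++) {
            /* accept the workers' check-ins in whatever order they arrive */
            MPI_Recv(&val, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &st);
            printf("received %d from %d\n", val, st.MPI_SOURCE);
        }
        printf("all workers checked in!\n");
    } else {
        int val = 10 + rank;    /* e.g. rank 1 reports 11, rank 2 reports 12 */
        MPI_Send(&val, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}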
>>> 
>>> My test program works fine when I run multiple processes on a single 
>>> machine.
>>> 
>>> either on the local node:
>>> 
>>> $ ./mpirun -n 4 ./mpi-test
>>> Hi I'm localhost:2
>>> Hi I'm localhost:1
>>> localhost:1 sending 11...
>>> localhost:2 sending 12...
>>> localhost:2 sent 12
>>> localhost:1 sent 11
>>> Hi I'm localhost:0
>>> localhost:0 received 11 from 1
>>> localhost:0 received 12 from 2
>>> Hi I'm localhost:3
>>> localhost:3 sending 13...
>>> localhost:3 sent 13
>>> localhost:0 received 13 from 3
>>> all workers checked in!
>>> 
>>> or a remote one:
>>> 
>>> $ ./mpirun -np 2 -host remotehost ./mpi-test
>>> Hi I'm remotehost:0
>>> remotehost:0 received 11 from 1
>>> all workers checked in!
>>> Hi I'm remotehost:1
>>> remotehost:1 sending 11...
>>> remotehost:1 sent 11
>>> 
>>> But when I try to run the master locally and the worker(s) remotely
>>> (this is the way I am actually interested in running it), Send never
>>> returns and it hangs indefinitely.
>>> 
>>> $ ./mpirun -np 2 -host localhost,remotehost ./mpi-test
>>> Hi I'm localhost:0
>>> Hi I'm remotehost:1
>>> remotehost:1 sending 11...
>>> 
>>> Just to see if it would work, I tried spawning the master on the
>>> remotehost and the worker on the localhost.
>>> 
>>> $ ./mpirun -np 2 -host remotehost,localhost ./mpi-test
>>> Hi I'm localhost:1
>>> localhost:1 sending 11...
>>> localhost:1 sent 11
>>> Hi I'm remotehost:0
>>> remotehost:0 received 0 from 1
>>> all workers checked in!
>>> 
>>> It doesn't hang on Send, but the wrong value is received.
>>> 
>>> Any idea what's going on? I've attached my code, my config.log,
>>> ifconfig output, and ompi_info output.
>>> 
>>> Thanks,
>>> Keith
>>> <mpi.tgz>_______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 4
>> Date: Thu, 4 Aug 2011 20:48:30 -0400
>> From: Jeff Squyres <jsquy...@cisco.com>
>> Subject: Re: [OMPI users] openmpi 1.2.8 on Xgrid noob issue
>> To: Open MPI Users <us...@open-mpi.org>
>> Message-ID: <c2ea7fd0-badb-4d05-851c-c444be26f...@cisco.com>
>> Content-Type: text/plain; charset=us-ascii
>> 
>> I'm afraid our Xgrid support has lagged, and Apple hasn't shown much 
>> interest in MPI + Xgrid support -- much less HPC.  :-\
>> 
>> Have you seen the FAQ items about Xgrid?
>> 
>>   http://www.open-mpi.org/faq/?category=osx#xgrid-howto
>> 
>> 
>> On Aug 4, 2011, at 4:16 AM, Christopher Jones wrote:
>> 
>>> Hi there,
>>> 
>>> I'm currently trying to set up a small Xgrid between two Mac Pros (a single 
>>> quad-core and a 2 x dual-core), where both are directly connected via an 
>>> Ethernet cable. I've set up Xgrid using password authentication (rather 
>>> than Kerberos), and from what I can tell in the Xgrid admin tool it seems 
>>> to be working. However, once I try a simple hello world program, I get this 
>>> error:
>>> 
>>> chris-joness-mac-pro:~ chrisjones$ mpirun -np 4 ./test_hello
>>> mpirun noticed that job rank 0 with PID 381 on node xgrid-node-0 exited on 
>>> signal 15 (Terminated).
>>> 1 additional process aborted (not shown)
>>> 2011-08-04 10:02:16.329 mpirun[350:903] *** Terminating app due to uncaught 
>>> exception 'NSInvalidArgumentException', reason: '*** 
>>> -[NSKVONotifying_XGConnection<0x1001325a0> finalize]: called when 
>>> collecting not enabled'
>>> *** Call stack at first throw:
>>> (
>>>     0   CoreFoundation                      0x00007fff814237b4 
>>> __exceptionPreprocess + 180
>>>     1   libobjc.A.dylib                     0x00007fff84fe8f03 
>>> objc_exception_throw + 45
>>>     2   CoreFoundation                      0x00007fff8143e631 
>>> -[NSObject(NSObject) finalize] + 129
>>>     3   mca_pls_xgrid.so                    0x00000001002a9ce3 
>>> -[PlsXGridClient dealloc] + 419
>>>     4   mca_pls_xgrid.so                    0x00000001002a9837 
>>> orte_pls_xgrid_finalize + 40
>>>     5   libopen-rte.0.dylib                 0x000000010002d0f9 
>>> orte_pls_base_close + 249
>>>     6   libopen-rte.0.dylib                 0x0000000100012027 
>>> orte_system_finalize + 119
>>>     7   libopen-rte.0.dylib                 0x000000010000e968 
>>> orte_finalize + 40
>>>     8   mpirun                              0x00000001000011ff orterun + 
>>> 2042
>>>     9   mpirun                              0x0000000100000a03 main + 27
>>>     10  mpirun                              0x00000001000009e0 start + 52
>>>     11  ???                                 0x0000000000000004 0x0 + 4
>>> )
>>> terminate called after throwing an instance of 'NSException'
>>> [chris-joness-mac-pro:00350] *** Process received signal ***
>>> [chris-joness-mac-pro:00350] Signal: Abort trap (6)
>>> [chris-joness-mac-pro:00350] Signal code:  (0)
>>> [chris-joness-mac-pro:00350] [ 0] 2   libSystem.B.dylib                   
>>> 0x00007fff81ca51ba _sigtramp + 26
>>> [chris-joness-mac-pro:00350] [ 1] 3   ???                                 
>>> 0x00000001000cd400 0x0 + 4295808000
>>> [chris-joness-mac-pro:00350] [ 2] 4   libstdc++.6.dylib                   
>>> 0x00007fff830965d2 __tcf_0 + 0
>>> [chris-joness-mac-pro:00350] [ 3] 5   libobjc.A.dylib                     
>>> 0x00007fff84fecb39 _objc_terminate + 100
>>> [chris-joness-mac-pro:00350] [ 4] 6   libstdc++.6.dylib                   
>>> 0x00007fff83094ae1 _ZN10__cxxabiv111__terminateEPFvvE + 11
>>> [chris-joness-mac-pro:00350] [ 5] 7   libstdc++.6.dylib                   
>>> 0x00007fff83094b16 _ZN10__cxxabiv112__unexpectedEPFvvE + 0
>>> [chris-joness-mac-pro:00350] [ 6] 8   libstdc++.6.dylib                   
>>> 0x00007fff83094bfc 
>>> _ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception + 0
>>> [chris-joness-mac-pro:00350] [ 7] 9   libobjc.A.dylib                     
>>> 0x00007fff84fe8fa2 object_getIvar + 0
>>> [chris-joness-mac-pro:00350] [ 8] 10  CoreFoundation                      
>>> 0x00007fff8143e631 -[NSObject(NSObject) finalize] + 129
>>> [chris-joness-mac-pro:00350] [ 9] 11  mca_pls_xgrid.so                    
>>> 0x00000001002a9ce3 -[PlsXGridClient dealloc] + 419
>>> [chris-joness-mac-pro:00350] [10] 12  mca_pls_xgrid.so                    
>>> 0x00000001002a9837 orte_pls_xgrid_finalize + 40
>>> [chris-joness-mac-pro:00350] [11] 13  libopen-rte.0.dylib                 
>>> 0x000000010002d0f9 orte_pls_base_close + 249
>>> [chris-joness-mac-pro:00350] [12] 14  libopen-rte.0.dylib                 
>>> 0x0000000100012027 orte_system_finalize + 119
>>> [chris-joness-mac-pro:00350] [13] 15  libopen-rte.0.dylib                 
>>> 0x000000010000e968 orte_finalize + 40
>>> [chris-joness-mac-pro:00350] [14] 16  mpirun                              
>>> 0x00000001000011ff orterun + 2042
>>> [chris-joness-mac-pro:00350] [15] 17  mpirun                              
>>> 0x0000000100000a03 main + 27
>>> [chris-joness-mac-pro:00350] [16] 18  mpirun                              
>>> 0x00000001000009e0 start + 52
>>> [chris-joness-mac-pro:00350] [17] 19  ???                                 
>>> 0x0000000000000004 0x0 + 4
>>> [chris-joness-mac-pro:00350] *** End of error message ***
>>> Abort trap
>>> 
>>> 
>>> I've seen this error in a previous mailing, and it seems that the issue has 
>>> something to do with forcing everything to use Kerberos (SSO). However, I 
>>> noticed that on the computer being used as an agent, this option is grayed 
>>> out in the Xgrid sharing configuration (I have no idea why). I would 
>>> therefore ask whether it is absolutely necessary to use SSO to get Open MPI 
>>> to run with Xgrid, or am I missing something in the password setup? The 
>>> Kerberos option seems much more complicated, and I may even switch to just 
>>> using Open MPI over SSH.
>>> 
>>> Many thanks,
>>> Chris
>>> 
>>> 
>>> Chris Jones
>>> Post-doctoral Research Assistant,
>>> 
>>> Department of Microbiology
>>> Swedish University of Agricultural Sciences
>>> Uppsala, Sweden
>>> phone: +46 (0)18 67 3222
>>> email: chris.jo...@slu.se
>>> 
>>> Department of Soil and Environmental Microbiology
>>> National Institute for Agronomic Research
>>> Dijon, France
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 5
>> Date: Fri, 5 Aug 2011 08:41:58 -0500
>> From: Rob Latham <r...@mcs.anl.gov>
>> Subject: Re: [OMPI users] parallel I/O on 64-bit indexed arays
>> To: Open MPI Users <us...@open-mpi.org>
>> Cc: Quincey Koziol <koz...@hdfgroup.org>, Fab Tillier
>>       <ftill...@microsoft.com>
>> Message-ID: <20110805134158.ga28...@mcs.anl.gov>
>> Content-Type: text/plain; charset=us-ascii
>> 
>> On Wed, Jul 27, 2011 at 06:13:05PM +0200, Troels Haugboelle wrote:
>>> and we get good (+GB/s) performance when writing files from large runs.
>>> 
>>> Interestingly, an alternative and conceptually simpler option is to
>>> use MPI_FILE_WRITE_ORDERED, but the performance of that function on
>>> Blue-Gene/P sucks - 20 MB/s instead of GB/s. I do not know why.
>> 
>> Ordered mode as implemented in ROMIO is awful.  Entirely serialized.
>> We pass a token from process to process. Each process acquires the
>> token, updates the shared file pointer, does its i/o, then passes the
>> token to the next process.
>> 
>> What we should do, and have done in test branches [1], is use MPI_SCAN
>> to look at the shared file pointer once, tell all the processors their
>> offset, then update the shared file pointer while all processes do I/O
>> in parallel.
>> 
>> [1]: Robert Latham, Robert Ross, and Rajeev Thakur. "Implementing
>> MPI-IO Atomic Mode and Shared File Pointers Using MPI One-Sided
>> Communication". International Journal of High Performance Computing
>> Applications, 21(2):132-143, 2007
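The same trick works at the application level if you control the I/O code (a sketch, assuming each rank contributes 'count' bytes to a file region starting at a common 'base' offset, placed in rank order):

#include <mpi.h>

/* Every rank learns its own offset from one MPI_Scan, then all ranks
   write concurrently with explicit offsets instead of the shared pointer. */
int write_rank_ordered(MPI_File fh, MPI_Offset base, void *buf, int count)
{
    long long mine = count, up_to_me = 0;

    /* inclusive prefix sum of byte counts over MPI_COMM_WORLD */
    MPI_Scan(&mine, &up_to_me, 1, MPI_LONG_LONG, MPI_SUM, MPI_COMM_WORLD);

    /* bytes written by lower-ranked processes = inclusive sum minus my own */
    MPI_Offset offset = base + (MPI_Offset)(up_to_me - mine);

    return MPI_File_write_at_all(fh, offset, buf, count, MPI_BYTE,
                                 MPI_STATUS_IGNORE);
}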
>> 
>> Since no one uses the shared file pointers, and even fewer people use
>> ordered mode, we just haven't seen the need to do so.
>> 
>> Do you want to rebuild your MPI library on BlueGene?  I can pretty
>> quickly generate and send a patch that will make ordered mode go whip
>> fast.
>> 
>> ==rob
>> 
>>> 
>>> Troels
>>> 
>>> On 6/7/11 15:04 , Jeff Squyres wrote:
>>>> On Jun 7, 2011, at 4:53 AM, Troels Haugboelle wrote:
>>>> 
>>>>> In principle yes, but the problem is we have an unequal number of 
>>>>> particles on each node, so the length of each array is not guaranteed to 
>>>>> be divisible by 2, 4 or any other number. If I have understood the 
>>>>> definition of MPI_TYPE_CREATE_SUBARRAY correctly, the offset can be 
>>>>> 64-bit, but not the global array size; so, optimally, what I am looking 
>>>>> for is something that allows an unequal size for each task (a simple 
>>>>> vector) with 64-bit offsets and a 64-bit global array size.
>>>> It's a bit awkward, but you can still make datatypes to give the offset 
>>>> that you want.  E.g., if you need an offset of 2B+31 bytes, you can make 
>>>> datatype A with type contig of N=(2B/sizeof(int)) int's.  Then make 
>>>> datatype B with type struct, containing type A and 31 MPI_BYTEs.  Then use 
>>>> 1 instance of datatype B to get the offset that you want.
>>>> 
>>>> You could make utility functions that, given a specific (64 bit) offset, 
>>>> it makes an MPI datatype that matches the offset, and then frees it (and 
>>>> all sub-datatypes).
>>>> 
>>>> There is a bit of overhead in creating these datatypes, but it should be 
>>>> dwarfed by the amount of data that you're reading/writing, right?
>>>> 
>>>> It's awkward, but it should work.
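Spelled out, that construction looks roughly like this (a sketch; the "2B" and 31 are just the numbers from the example, and real code would derive them from the 64-bit offset it needs):

#include <mpi.h>

/* Build a datatype whose extent is nints*sizeof(int) + nbytes bytes,
   e.g. nints = 2000000000/sizeof(int) and nbytes = 31 for "2B+31". */
static MPI_Datatype make_offset_type(int nints, int nbytes)
{
    MPI_Datatype contig, offtype;
    MPI_Aint     lb, extent;
    int          blocklens[2];
    MPI_Aint     displs[2];
    MPI_Datatype types[2];

    MPI_Type_contiguous(nints, MPI_INT, &contig);        /* datatype A */
    MPI_Type_get_extent(contig, &lb, &extent);

    blocklens[0] = 1;      displs[0] = 0;      types[0] = contig;
    blocklens[1] = nbytes; displs[1] = extent; types[1] = MPI_BYTE;
    MPI_Type_create_struct(2, blocklens, displs, types, &offtype);

    MPI_Type_commit(&offtype);                           /* datatype B */
    MPI_Type_free(&contig);
    return offtype;   /* skipping one instance of this covers the offset */
}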
>>>> 
>>>>> Another possible workaround would be to identify subsections that do not 
>>>>> exceed 2B elements, make sub-communicators, and then let each of them 
>>>>> dump their elements with the proper offsets. It may work. The problematic 
>>>>> architecture is a BG/P. On other clusters, doing simple I/O (letting all 
>>>>> threads open the file, seek to their position, and then write their 
>>>>> chunk) works fine, but somehow on BG/P performance drops dramatically. My 
>>>>> guess is that there is some file locking, or we are overwhelming the I/O 
>>>>> nodes.
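A sketch of that workaround (hypothetical; the 'color' argument stands in for whatever grouping keeps each group's total element count below 2^31, and the offsets are plain 64-bit MPI_Offset values):

#include <mpi.h>

void dump_by_group(const char *fname, int color, MPI_Offset group_base,
                   MPI_Offset my_offset_in_group, float *buf, int nelem)
{
    MPI_Comm subcomm;
    MPI_File fh;
    int rank;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* each group gets its own communicator and writes its own file block */
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &subcomm);

    MPI_File_open(subcomm, (char *)fname,
                  MPI_MODE_WRONLY | MPI_MODE_CREATE, MPI_INFO_NULL, &fh);
    MPI_File_write_at_all(fh, group_base + my_offset_in_group, buf, nelem,
                          MPI_FLOAT, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);
    MPI_Comm_free(&subcomm);
}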
>>>>> 
>>>>>> This ticket for the MPI-3 standard is a first step in the right 
>>>>>> direction, but won't do everything you need (this is more FYI):
>>>>>> 
>>>>>>   https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/265
>>>>>> 
>>>>>> See the PDF attached to the ticket; it's going up for a "first reading" 
>>>>>> in a month.  It'll hopefully be part of the MPI-3 standard by the end of 
>>>>>> the year (Fab Tillier, CC'ed, has been the chief proponent of this 
>>>>>> ticket for the past several months).
>>>>>> 
>>>>>> Quincey Koziol from the HDF group is going to propose a follow on to 
>>>>>> this ticket, specifically about the case you're referring to -- large 
>>>>>> counts for file functions and datatype constructors.  Quincey -- can you 
>>>>>> expand on what you'll be proposing, perchance?
>>>>> Interesting, I think something along the lines of the note would be very 
>>>>> useful and needed for large applications.
>>>>> 
>>>>> Thanks a lot for the pointers and your suggestions,
>>>>> 
>>>>> cheers,
>>>>> 
>>>>> Troels
>>>> 
>>> 
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> --
>> Rob Latham
>> Mathematics and Computer Science Division
>> Argonne National Lab, IL USA
>> 
>> 
>> ------------------------------
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> End of users Digest, Vol 1977, Issue 1
>> **************************************
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

