[OMPI users] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file util/show_help.c at line 501; error in device init Mesh created.

2023-05-19 Thread Rob Kudyba via users
RHEL 8 with OpenMPI 4.1.5a1 on an HPC cluster compute node, Singularity version 3.7.1. I see the error in another issue mentioned at the Git page and on SO

Re: [OMPI users] Open MPI 4.0.3 outside as well as inside a SimpleFOAM container: step creation temporarily disabled, retrying Requested nodes are busy

2023-03-01 Thread Rob Kudyba via users
/openfoam10/etc/bashrc&&simpleFoam -parallel' There are probably other ways to get this to work but the above did the trick. Thanks for the suggestion. Rob >

[OMPI users] Open MPI 4.0.3 outside as well as inside a SimpleFOAM container: step creation temporarily disabled, retrying Requested nodes are busy

2023-02-28 Thread Rob Kudyba via users
ttached restart_syscall(<... resuming interrupted poll ...>^Cstrace: Process 11650 detached With or without the --exclusive option all I get is: srun: Job 12525169 step creation temporarily disabled, retrying (Requested nodes are busy) srun: Job 12525169 step creation temporarily disabled, retrying (Requested nodes are busy) Are the options not in the correct order? Thanks, Rob

[OMPI users] --mca parameter explainer; mpirun WARNING: There was an error initializing an OpenFabrics device

2022-09-22 Thread Rob Kudyba via users
We're using OpenMPI 4.1.1, CUDA aware, on a RHEL 8 cluster that we load as a module, with InfiniBand controller Mellanox Technologies MT28908 Family ConnectX-6; we see this warning running mpirun without any MCA options/parameters: WARNING: There was an error initializing an OpenFabrics device. Local

[OMPI users] kernel trap - divide by zero

2020-04-29 Thread Rob Scott (roscott2) via users
We are seeing a kernel trap in hwloc being reported from a few customers. In one particular case, here are the details. hwloc-1.10.1 Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz The offending code is in look_proc(), due to cpuid function 0x1 returning 4 logical processors, or possibly hwloc_flsl() ma

Re: [OMPI users] MPI_File_read+MPI_BOTTOM crash on NFS ?

2016-06-22 Thread Rob Latham
-- clang won't warn about that, assuming we know what we are doing (hah!) ==rob

Re: [OMPI users] Docker Cluster Queue Manager

2016-06-22 Thread Rob Nagler
> issue of running MPI jobs with it, though. > > I don't see how Singularity addresses the problem of starting MPI inside Docker. In any event, our current plan is to bypass resource managers completely and start an AWS fleet per user request. The code is much simpler for everybody. Rob

Re: [OMPI users] Regression in MPI_File_close?!

2016-06-07 Thread Rob Latham
nters. If we did not have this stupid hidden file with the shared file pointer offset to worry about, close becomes a lot simpler. It's just that we open the shared file pointer from rank 0 with DELETE_ON_CLOSE. ==rob Thanks Edgar On 5/31/2016 9:33 PM, Gilles Gouaillardet wrote:

Re: [OMPI users] Docker Cluster Queue Manager

2016-06-06 Thread Rob Nagler
blems. They are running jobs on behalf of web users (like our problem) as a single unix user id. Docker Hub runs containers as root to build images so they must be able to lock down containers well enough. Another thing we can (and probably should) do is verify the images have no setuid files. I think this would eliminate a lot of the privilege escalation issues. Rob

Re: [OMPI users] Docker Cluster Queue Manager

2016-06-06 Thread Rob Nagler
'll play with Slurm Elastic Compute this week to see how it works. Rob

Re: [OMPI users] Docker Cluster Queue Manager

2016-06-04 Thread Rob Nagler
s described above. Even being able to list all the other users on a system (via "ls /home") is a privacy breach in a web app. Rob

Re: [OMPI users] Docker Cluster Queue Manager

2016-06-04 Thread Rob Nagler
olve the problem of queuing dynamic clusters. SLURM/Torque, which Shifter relies on, does not either. This is probably the most difficult item. StarCluster does solve this problem, but doesn't work on bare metal, and it's not clear if it is being maintained any more. Rob

Re: [OMPI users] Docker Cluster Queue Manager

2016-06-03 Thread Rob Nagler
echanism. JupyterHub is a fine front-end for what we want to do. All we need is a qsub that is decoupled from Unix user ids and allows for the creation of clusters dynamically. Rob

Re: [OMPI users] Docker Cluster Queue Manager

2016-06-03 Thread Rob Nagler
and users can package their code portably. The "module load" systems like Bright Cluster offers are irrelevant. Let users build their images as they like with only a few requirements, and they can run them with JupyterHub AND in an HPC environment, which eliminates the need for Singularity. Rob

Re: [OMPI users] Docker Cluster Queue Manager

2016-06-02 Thread Rob Nagler
clusters and users who have no local credentials (non-Unix user -- just a name and a home directory). TIA, Rob

[OMPI users] Docker Cluster Queue Manager

2016-06-02 Thread Rob Nagler
iasoft Early phase wiki: https://github.com/radiasoft/devops/wiki/DockerMPI Thanks, Rob

[OMPI users] Building vs packaging

2016-05-14 Thread Rob Malpass
Hi all I posted about a fortnight ago to this list as I was having some trouble getting my nodes to be controlled by my master node. Perceived wisdom at the time was to compile with --enable-orterun-prefix-by-default. For some time I'd been getting "cannot open libopen-rte.so.7" which poin

[OMPI users] Ubuntu and LD_LIBRARY_PATH

2016-04-25 Thread Rob Malpass
Hi Sorry if this isn't 100% relevant to this list but I'm at my wits end. After a lot of hacking, I've finally configured openmpi on my Ubuntu cluster. I had been having awful problems with not being able to find the libraries on the remote nodes but apparently the workaround is to use ld.

Re: [OMPI users] Error with MPI_Register_datarep

2016-03-14 Thread Rob Latham
ovide not only the sort of platform portability Eric desires, but also provide a self-describing file format. ==rob George. On Mar 12, 2016, at 22:26, Éric Chamberland <eric.chamberl...@giref.ulaval.ca> wrote: ERROR Returned by MPI: 51 ERROR_string Retu

Re: [OMPI users] PVFS/OrangeFS (was: cleaning up old ROMIO (MPI-IO) drivers)

2016-02-12 Thread Rob Latham
On 01/26/2016 09:32 AM, Dave Love wrote: Rob Latham writes: We didn't need to deploy PLFS at Argonne: GPFS handled writing N-to-1 files just fine (once you line up the block sizes), so I'm beholden to PLFS communities for ROMIO support. I guess GPFS has improved in that res

Re: [OMPI users] cleaning up old ROMIO (MPI-IO) drivers

2016-02-12 Thread Rob Latham
cluding it, assuming it's still worthwhile with Lustre in particular. PLFS folks seem to think it's not worth the effort? https://github.com/plfs/plfs-core/issues/361 ==rob

Re: [OMPI users] error openmpi check hdf5

2016-02-11 Thread Rob Latham
! ==rob

Re: [OMPI users] MX replacement?

2016-02-11 Thread Rob Latham
ta to/from file system). not sure if that's what you were asking... ==rob

Re: [OMPI users] cleaning up old ROMIO (MPI-IO) drivers

2016-01-25 Thread Rob Latham
On 01/21/2016 05:59 AM, Dave Love wrote: [Catching up...] Rob Latham writes: Do you use any of the other ROMIO file system drivers? If you don't know if you do, or don't know what a ROMIO file system driver is, then it's unlikely you are using one. What if you use a driv

Re: [OMPI users] [mpich-discuss] cleaning up old ROMIO (MPI-IO) drivers

2016-01-05 Thread Rob Latham
On 01/05/2016 11:43 AM, Gus Correa wrote: Hi Rob Your email says you'll keep PVFS2. However, on your blog PVFS2 is not mentioned (on the "Keep" list). I suppose it will be kept, right? Right. An oversight on my part. PVFS2 will stay. ==rob Thank you, Gus Correa On 01/0

[OMPI users] cleaning up old ROMIO (MPI-IO) drivers

2016-01-05 Thread Rob Latham
river and it's not on the list? First off, let me know and I will probably want to visit your site and take a picture of your system. Then, let me know how much longer you foresee using the driver and we'll create a "deprecated" list for N more years. Thanks ==rob -

Re: [OMPI users] GPUDirect with OpenMPI

2015-03-09 Thread Aulwes, Rob
Hi Rolf, Sorry, this email got sorted into a separate folder and I missed it. I'll try it out this week. Thanks! Rob From: Rolf vandeVaart <rvandeva...@nvidia.com> Reply-To: Open MPI Users <us...@open-mpi.org> List-Post: users@lists.open-mpi.org Date: Tue, 3

Re: [OMPI users] LAM/MPI -> OpenMPI

2015-02-27 Thread Rob Latham
On 02/27/2015 11:14 AM, Jeff Squyres (jsquyres) wrote: Well, perhaps it was time. We haven't changed anything about LAM/MPI in ...a decade? Now that the domain is gone, since I don't even have an SVN checkout any more, I can't check when the last meaningful commit was. I

Re: [OMPI users] LAM/MPI -> OpenMPI

2015-02-27 Thread Rob Latham
PI-IO implementation, was this one: r10377 | brbarret | 2007-07-02 21:53:06 so you're missing out on 8 years of I/O related bug fixes and optimizations. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] MPIIO and OrangeFS

2015-02-25 Thread Rob Latham
n many situations. Edgar has drawn the build error of OMPI-master to my attention. I'll get that fixed straightaway. ==rob At the OrangeFS documentation (http://docs.orangefs.com/v_2_8_8/index.htm) there is a chapter about using ROMIO, and it says that I should compile apps with -lpvfs2. I have tried

Re: [OMPI users] MPIIO and OrangeFS

2015-02-24 Thread Rob Latham
flags="--with-file-system=pvfs2+ufs+nfs" I'm not sure how OMPIO takes flags. If pvfs2-ping and pvfs2-cp and pvfs2-ls work, then you can bypass the kernel. also, please check return codes: http://stackoverflow.com/questions/22859269/what-do-mpi-io-error-codes-mean/26373193#26373

[OMPI users] GPUDirect with OpenMPI

2015-02-11 Thread Aulwes, Rob
receive the correct value from one of the neighbors. The code was compiled using PGI 14.7: mpif90 -o direct.x -acc acc_direct.f90 and executed with: mpirun -np 4 -npernode 2 -mca btl_openib_want_cudagdr 1 ./direct.x Does anyone know if I'm missing something when using GPUDirect? Thanks, Rob A

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2015-01-14 Thread Rob Latham
//git.mpich.org/mpich.git/commit/a30a4721a2 ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2015-01-12 Thread Rob Latham
hat code to use PATH_MAX, not 256, which would have fixed the specific problem you encountered (and might have been sufficient to get us 10 more years, at which point someone might try to create a file with 1000 characters in it) ==rob Thanks for helping! Eric On 12/15/2014 11:42 PM,

Re: [OMPI users] mpi_file_read and arrays of custom datatypes

2014-12-01 Thread Rob Latham
scrutable -- and I like C-style 6 character variables a lot! ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4

2014-09-18 Thread Rob Latham
On 09/18/2014 04:56 PM, Beichuan Yan wrote: Rob, Thank you very much for the suggestion. There are two independent scenarios using parallel IO in my code: 1. MPI processes conditionally print, i.e., some processes print in current loop (but may not print in next loop), some processes do

Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4

2014-09-18 Thread Rob Latham
On 09/17/2014 05:46 PM, Beichuan Yan wrote: Hi Rob, As you pointed out in April that there are many cases that could arouse ADIOI_Set_lock error. My code writes to a file at a location specified by a shared file pointer (it is a blocking and collective call): MPI_File_write_ordered

Re: [OMPI users] Runtime replacement of mpi libraries?

2014-09-11 Thread Rob Latham
recent versions, take a binary for Intel-MPI, IBM-MPI, and MPICH and use different libraries. There's almost no hope of being able to take MPICH and swap it out for OpenMPI. MorphMPI might get you a bit of the way, but there are some restrictions on what you can and cannot do with that.

Re: [OMPI users] Best way to communicate a 2d array with Java binding

2014-08-22 Thread Rob Latham
construct an HINDEXED type (or with very new MPICH, HINDEXED_BLOCK) and send that instead of copying. ==rob On Fri, Aug 22, 2014 at 3:38 PM, Rob Latham <r...@mcs.anl.gov> wrote: On 08/22/2014 10:10 AM, Saliya Ekanayake wrote: Hi, I've a quick questi
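
A minimal C sketch of the hindexed idea (the C bindings, not the Java ones; the function and the rows/m/n names are illustrative, not from the thread):

    #include <stdlib.h>
    #include <mpi.h>

    /* Describe m rows of n ints that live at arbitrary addresses, so the
     * whole 2D structure can be sent in one call without copying. */
    MPI_Datatype make_rows_type(int m, int n, int **rows)
    {
        int          *lens   = malloc(m * sizeof(int));
        MPI_Aint     *displs = malloc(m * sizeof(MPI_Aint));
        MPI_Datatype  newtype;

        for (int i = 0; i < m; i++) {
            lens[i] = n;
            MPI_Get_address(rows[i], &displs[i]);   /* absolute addresses */
        }
        MPI_Type_create_hindexed(m, lens, displs, MPI_INT, &newtype);
        MPI_Type_commit(&newtype);
        free(lens);
        free(displs);
        /* send one element of this type rooted at MPI_BOTTOM, e.g.
         * MPI_Send(MPI_BOTTOM, 1, newtype, dest, tag, comm); */
        return newtype;
    }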

Re: [OMPI users] Best way to communicate a 2d array with Java binding

2014-08-22 Thread Rob Latham
n and send it 2. Copy values to a 1D array of size m*n and send it i have no idea about the java mpi bindings, but can you describe the type with an mpi datatype? ==rob I guess 2 would internally do the copying to a buffer and use it, so suggesting 1. is the best option. Is this the case or

Re: [OMPI users] MPI-I/O issues

2014-08-11 Thread Rob Latham
ion's type-inquiry routines. *I* don't care how OpenMPI deals with UB and LB. It was *you* who suggested one might need to look a bit more closely at how OpenMPI type processing handles those markers: http://www.open-mpi.org/community/lists/users/2014/05/24325.php ==rob Ge

Re: [OMPI users] MPI-I/O issues

2014-08-11 Thread Rob Latham
On 08/10/2014 07:32 PM, Mohamad Chaarawi wrote: Update: George suggested that I try with the 1.8.2 rc3 and that one resolves the hindexed_block segfault that I was seeing with ompi. The I/O part now works with ompio, but needs the patches from Rob in ROMIO to work correctly. The 2nd issue

Re: [OMPI users] MPI-I/O issues

2014-08-06 Thread Rob Latham
/commit/97114ec5b - http://git.mpich.org/mpich.git/commit/90e15e9b0 - http://git.mpich.org/mpich.git/commit/76a079c7c ... and two more patches that are sitting in my tree waiting review. ==rob [jam:15566] [ 2] /scr/chaarawi/install/ompi/lib/libmpi.so.1(ADIO_Set_view+0x1c1)[0xc72a6d] [jam

Re: [OMPI users] MPI-I/O issues

2014-08-06 Thread Rob Latham
mio resync. You are on your own with ompio! ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] Using PLFS with Open MPI 1.8

2014-07-28 Thread Rob Latham
OpenMPI to pick up all the bits... As with Lustre, I don't have access to a PLFS system and would welcome community contributions to integrate and test PLFS into ROMIO. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] MPIIO and derived data types

2014-07-21 Thread Rob Latham
c)) which (if I am reading fortran correctly) is a contiguous chunk of memory. If instead you had a more elaborate data structure, like a mesh of some kind, then passing an indexed type to the read call might make more sense. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA
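
A small C sketch of handing an indexed type to a single read, as suggested above; the block lengths and displacements are purely illustrative (a Fortran version would be analogous):

    #include <mpi.h>

    /* Three scattered blocks of doubles, described once and read in one call. */
    void read_mesh_pieces(MPI_File fh, double *buf)
    {
        int blocklens[3] = { 4, 2, 8 };     /* elements per block */
        int displs[3]    = { 0, 10, 20 };   /* element offsets into buf */
        MPI_Datatype mesh;

        MPI_Type_indexed(3, blocklens, displs, MPI_DOUBLE, &mesh);
        MPI_Type_commit(&mesh);
        MPI_File_read(fh, buf, 1, mesh, MPI_STATUS_IGNORE);
        MPI_Type_free(&mesh);
    }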

Re: [OMPI users] latest stable and win7/msvc2013

2014-07-17 Thread Rob Latham
on Microsoft's intentions regarding MPI and C99/C11 (just dreaming now). hey, (almost all of) c99 support is in place in visual studio 2013 http://blogs.msdn.com/b/vcblog/archive/2013/07/19/c99-library-support-in-visual-studio-2013.aspx ==rob On 2014-07-17 11:42 AM, Jed Brown wrote: Rob Lath

Re: [OMPI users] latest stable and win7/msvc2013

2014-07-17 Thread Rob Latham
s support every month -- probably more often. There's clearly a community of Windows users out there looking for a free MPI implementation. Sorry to hear that MS-MPI is not yet MPI-3 compliant. There is also Intel-MPI and Platform-MPI but I think you have to license those? ==rob On J

Re: [OMPI users] bug in MPI_File_set_view?

2014-05-19 Thread Rob Latham
about how to handle these special cases, memory errors such as you report can happen. ==rob Thanks Edgar On 5/15/2014 3:56 AM, CANELA-XANDRI Oriol wrote: Hi, I installed and tried with version 1.8.1 but it also fails. I see the error when there are some processes without any matrix block. It

Re: [OMPI users] ROMIO bug reading darrays

2014-05-08 Thread Rob Latham
On 05/07/2014 11:36 AM, Rob Latham wrote: On 05/05/2014 09:20 PM, Richard Shaw wrote: Hello, I think I've come across a bug when using ROMIO to read in a 2D distributed array. I've attached a test case to this email. Thanks for the bug report and the test case. I've o

Re: [OMPI users] ROMIO bug reading darrays

2014-05-08 Thread Rob Latham
self-contained tests. I found the problem in MPICH, but i don't know how it relates to OpenMPI -- the darray bug is one I introduced on tuesday, so OpenMPI's ROMIO code should not have a problem with this darray type. ==rob In the testcase I first initialise an array of 25 doubles (w

Re: [OMPI users] ROMIO bug reading darrays

2014-05-08 Thread Rob Latham
On 05/07/2014 11:36 AM, Rob Latham wrote: On 05/05/2014 09:20 PM, Richard Shaw wrote: Hello, I think I've come across a bug when using ROMIO to read in a 2D distributed array. I've attached a test case to this email. Thanks for the bug report and the test case. I've o

Re: [OMPI users] ROMIO bug reading darrays

2014-05-07 Thread Rob Latham
On 05/07/2014 03:10 PM, Richard Shaw wrote: Thanks Rob. I'll keep track of it over there. How often do updated versions of ROMIO get pulled over from MPICH into OpenMPI? On a slightly related note, I think I heard that you had fixed the 32bit issues in ROMIO that were causing it to break

Re: [OMPI users] ROMIO bug reading darrays

2014-05-07 Thread Rob Latham
fault, not OpenMPI's fault... until I can prove otherwise ! :>) http://trac.mpich.org/projects/mpich/ticket/2089 ==rob In the testcase I first initialise an array of 25 doubles (which will be a 5x5 grid), then I create a datatype representing a 5x5 matrix distributed in 3x3 blocks o

Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4

2014-04-14 Thread Rob Latham
off data sieving writes, which is what I would have first guessed would trigger this lock message. So I guess you are hitting one of the other cases. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-03-28 Thread Rob Latham
On 03/27/2014 06:26 PM, Edgar Gabriel wrote: I will resubmit a new patch, Rob sent me a pointer to the correct solution. Its on my to do list for tomorrow/this weekend. I also found a bad memcopy (i was taking the size of a pointer to a thing instead of the size of the thing itself), but

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-03-25 Thread Rob Latham
afternoon again, it might be Friday until I can dig into that. Was there any progress with this? Otherwise, what version of PVFS2 is known to work with OMPI 1.6? Thanks. Edgar, should I pick this up for MPICH, or was this fix specific to OpenMPI ? ==rob -- Rob Latham Mathematics and Computer

Re: [OMPI users] MPI_FILE_READ: wrong file-size does not raise an exception

2013-11-15 Thread Rob Latham
size MPI_FILE_READ(...) returns 'MPI_SUCCESS: no errors'. Well, read values are just a mess. Does anyone have an idea how to catch such an error? The read succeeded. You would consult the status object to see how many bytes were actually read. ==rob
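
One way to consult the status object is MPI_Get_count; a minimal C sketch (function and variable names are illustrative):

    #include <stdio.h>
    #include <mpi.h>

    /* A read past end-of-file is not an error; the status says how much arrived. */
    void read_and_check(MPI_File fh, double *buf, int want)
    {
        MPI_Status status;
        int got;

        MPI_File_read(fh, buf, want, MPI_DOUBLE, &status);
        MPI_Get_count(&status, MPI_DOUBLE, &got);
        if (got < want)
            fprintf(stderr, "short read: asked for %d doubles, got %d\n", want, got);
    }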

Re: [OMPI users] MPIIO max record size

2013-07-19 Thread Rob Latham
be wrong. > > > > I think but I am not sure that it is because the MPI I/O (ROMIO) > code is the same for all distributions... > > It has been written by Rob Latham. Hello! Rajeev wrote it when he was in grad school, then he passed the torch to Rob Ross when he was a

Re: [OMPI users] opening a file with MPI-IO

2013-07-19 Thread Rob Latham
RN, which is used for that purpose in C. It's important to note that MPI-IO routines *do* use ERROR_RETURN as the error handler, so you will have to take the additional step of setting that. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA
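
A short C sketch of that extra step, i.e. installing a fatal error handler on the file handle (path and open mode are illustrative):

    #include <mpi.h>

    /* MPI-IO defaults to returning error codes; make failures abort instead.
     * Setting the handler on MPI_FILE_NULL covers MPI_File_open itself. */
    MPI_File open_or_abort(const char *path)
    {
        MPI_File fh;

        MPI_File_set_errhandler(MPI_FILE_NULL, MPI_ERRORS_ARE_FATAL);
        MPI_File_open(MPI_COMM_WORLD, path, MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
        MPI_File_set_errhandler(fh, MPI_ERRORS_ARE_FATAL);
        return fh;
    }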

Re: [OMPI users] Romio and OpenMPI builds

2013-01-21 Thread Rob Latham
site. > > If you have any choice at all please do not enable NFS. It will work, barely, but I'd rather see your users treat NFS as the broken horrorshow it is (with respect to MPI-IO) than to use it at all. ==rob > >Would this be a good recommendation for us to include in

Re: [OMPI users] Invalid filename?

2013-01-21 Thread Rob Latham
mestamps in the file name: http://trac.mcs.anl.gov/projects/parallel-netcdf/browser/trunk/README#L19 (That note went into the README in June 2005. I guess I should have done a better job advertising that ROMIO er, "quirk"? in the intervening seven (!) years. ) ==rob > On Jan 21,

Re: [OMPI users] Can't read more than 2^31 bytes with MPI_File_read, regardless of type?

2012-08-07 Thread Rob Latham
penMPI uses). Been on my list of "things to fix" for a while. ==rob > Output from ompi_info --all for the 1.4.4 build is also attached. > > > OpenMPI 1.4.4 > > Trying 268435457 of float, 1073741828 bytes: successfully read 268435457 > Trying 536870913 of

Re: [OMPI users] ROMIO Podcast

2012-02-22 Thread Rob Latham
On Tue, Feb 21, 2012 at 05:30:20PM -0500, Rayson Ho wrote: > On Tue, Feb 21, 2012 at 12:06 PM, Rob Latham wrote: > > ROMIO's testing and performance regression framework is honestly a > > shambles. Part of that is a challenge with the MPI-IO interface > > itself. For

Re: [OMPI users] ROMIO Podcast

2012-02-21 Thread Rob Latham
g but make the testing "surface area" a lot larger. We are probably going to have a chance to improve things greatly with some recently funded proposals. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] ROMIO Podcast

2012-02-21 Thread Rob Latham
a long time, then switched to SVN in I think 2007? I am way late to the git party, but git-svn is looking mighty attractive as a first step towards transitioning to full git. One more awful svn merge might be enough to push us over the edge. ==rob -- Rob Latham Mathematics and Computer Scienc

Re: [OMPI users] IO performance

2012-02-06 Thread Rob Latham
some nice ROMIO optimizations that will help you out with writes to GPFS if you set the "striping_unit" hint to the GPFS block size. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA
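
A hedged C sketch of passing that hint; the 4 MiB value is only an example, use the actual GPFS block size:

    #include <mpi.h>

    /* Tell ROMIO the file system block size via the "striping_unit" hint. */
    MPI_File open_aligned(const char *path)
    {
        MPI_File fh;
        MPI_Info info;

        MPI_Info_create(&info);
        MPI_Info_set(info, "striping_unit", "4194304");   /* example: 4 MiB */
        MPI_File_open(MPI_COMM_WORLD, path, MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      info, &fh);
        MPI_Info_free(&info);
        return fh;
    }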

Re: [OMPI users] MPI_File_Write

2011-11-29 Thread Rob Latham
here, or each processor will end up writing the same data to the same location in the file. If you duplicate the work identically to N processors then yeah, you will take N times longer. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] maximum size for read buffer in MPI_File_read/write

2011-09-27 Thread Rob Latham
g a bit. in general, if you plot "i/o performance vs blocksize", every file system tops out around several tens of megabytes. So, we have given the advice to just split up this nearly 2 gb request into several 1 gb requests. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA
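
A rough C sketch of that advice, splitting one large request into 1 GiB pieces (buffer handling and error checks are simplified):

    #include <mpi.h>

    /* Issue a sequence of <= 1 GiB reads instead of one ~2 GB request. */
    void read_in_chunks(MPI_File fh, char *buf, MPI_Offset total)
    {
        const MPI_Offset chunk = 1 << 30;   /* 1 GiB */
        MPI_Offset done = 0;

        while (done < total) {
            MPI_Offset left = total - done;
            int n = (int)(left < chunk ? left : chunk);
            MPI_File_read(fh, buf + done, n, MPI_BYTE, MPI_STATUS_IGNORE);
            done += n;
        }
    }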

Re: [OMPI users] IO issue with OpenMPI 1.4.1 and earlier versions

2011-09-13 Thread Rob Latham
or ways to improve our test coverage. While it looks like this workload has been fixed in recent versions of code, I'd like to include your test case to help us catch any regressions we might introduce down the line. I'd change it to be straight c and have rank 0 read back the file w

[OMPI users] mpiexec option for node failure

2011-09-12 Thread Rob Stewart
mpiexec or some other means? -- Rob Stewart

Re: [OMPI users] MPIIO and EXT3 file systems

2011-08-29 Thread Rob Latham
and friends are broken for XFS or EXT3, those kinds of bugs get a lot of attention :> At this point the usual course of action is "post a small reproducing test case". Your first message said this was a big code, so perhaps that will not be so easy... ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] MPIIO and EXT3 file systems

2011-08-22 Thread Rob Latham
s. Do you use MPI datatypes to describe either a file view or the application data? These noncontiguous in memory and/or noncontiguous in file access patterns will also trigger fcntl lock calls. You can use an MPI-IO hint to disable data sieving, at a potentially disastrous performance cost. ==r
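
A small C sketch, assuming the ROMIO hints romio_ds_read / romio_ds_write are the relevant knobs for the data-sieving behavior described above:

    #include <mpi.h>

    /* Disable ROMIO's data sieving (and with it the fcntl lock calls it needs).
     * This can cost a lot of performance for noncontiguous access. */
    void disable_data_sieving(MPI_Info info)
    {
        MPI_Info_set(info, "romio_ds_read",  "disable");
        MPI_Info_set(info, "romio_ds_write", "disable");
    }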

Re: [OMPI users] parallel I/O on 64-bit indexed arays

2011-08-05 Thread Rob Latham
o so. Do you want to rebuild your MPI library on BlueGene? I can pretty quickly generate and send a patch that will make ordered mode go whip fast. ==rob > > Troels > > On 6/7/11 15:04 , Jeff Squyres wrote: > >On Jun 7, 2011, at 4:53 AM, Troels Haugboelle wrote: > &g

Re: [OMPI users] File seeking with shared filepointer issues

2011-07-05 Thread Rob Latham
value of the shared file pointer, - Rank 0 did so before any other process read the value of the shared file pointer (the green bar) Anyway, this is all known behavior. collecting the traces seemed like a fun way to spend the last hour on friday before the long (USA) weekend :> ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] File seeking with shared filepointer issues

2011-07-01 Thread Rob Latham
On Sat, Jun 25, 2011 at 06:54:32AM -0400, Jeff Squyres wrote: > Rob -- can you comment on this, perchance? Is this a bug in ROMIO, or if > not, how is one supposed to use this interface can get consistent answers in > all MPI processes? Maybe the problem here is that shared file poin

Re: [OMPI users] File seeking with shared filepointer issues

2011-07-01 Thread Rob Latham
ect of linear message passing behind the scenes, but > that seems like a weird interface. > > Rob -- can you comment on this, perchance? Is this a bug in ROMIO, or if > not, how is one supposed to use this interface can get consistent answers in > all MPI processes? man, what a w

Re: [OMPI users] reading from a file

2011-05-24 Thread Rob Latham
where decomposing the dataset over N processors will be more straightforward. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] Trouble with MPI-IO

2011-05-24 Thread Rob Latham
m not an expert on the MPI IO stuff -- your code *looks* > right to me, but I could be missing something subtle in the interpretation of > MPI_FILE_SET_VIEW. I tried running your code with MPICH 1.3.2p1 and it also > hung. > > Rob (ROMIO guy) -- can you comment this code? Is

Re: [OMPI users] Deadlock with mpi_init_thread + mpi_file_set_view

2011-04-04 Thread Rob Latham
nsure that ROMIO's internal structures get initialized exactly once, and the delete hooks help us be good citizens and clean up on exit. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] MPI-2 I/O functions (Open MPI 1.5.x on Windows)

2011-04-04 Thread Rob Latham
; in my program. > It correctly worked on Open MPI on Linux. > I would very much appreciate any information you could send me. > I can't find it in Open MPI User's Mailing List Archives. you probably need to configure OpenMPI so that ROMIO (the MPI-IO library) is built with &

Re: [OMPI users] Deadlock with mpi_init_thread + mpi_file_set_view

2011-04-01 Thread Rob Latham
ry inside OpenMPI-1.4.3 is pretty old. I wonder if the locking we added over the years will help? Can you try openmpi-1.5.3 and report what happens? ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] questions about MPI-IO

2011-01-06 Thread Rob Latham
. MPI_SUCCESS) then call MPI_ERROR_STRING(iret, string, outlen, ierr) print *, string(1:outlen) endif end subroutine check_err external32 is a good idea but nowadays portable files are better served with something like HDF5, NetCDF-4 or Parallel-NetCDF, all of which

Re: [OMPI users] MPI-IO problem

2010-12-17 Thread Rob Latham
ot free the subarray type) - in writea you don't really need to seek and then write. You could call MPI_FILE_WRITE_AT_ALL. - You use collective I/O in writea (good for you!) but use independent I/O in writeb. Especially for a 2d subarray, you'll likely see better performance with MPI_FILE_WRITE_ALL. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA
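
A one-call C sketch of the suggested replacement for seek-then-write; offset and count stand in for whatever the application computed:

    #include <mpi.h>

    /* Collective write at an explicit offset: no separate seek needed. */
    void write_block(MPI_File fh, MPI_Offset offset, const double *buf, int count)
    {
        MPI_File_write_at_all(fh, offset, buf, count, MPI_DOUBLE,
                              MPI_STATUS_IGNORE);
    }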

Re: [OMPI users] How to avoid abort when calling MPI_Finalize without calling MPI_File_close?

2010-12-01 Thread Rob Latham
that closing files comes a little earlier in the shutdown process. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] out of memory in io_romio_ad_nfs_read.c

2010-11-22 Thread Rob Latham
ted, is not going to perform very well, and will likely, despite the library's best efforts, give you incorrect results. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] a question about [MPI]IO on systems without network filesystem

2010-10-19 Thread Rob Latham
you gave in the cb_config_list. Try it and if it does/doesn't work, I'd like to hear. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] Best way to reduce 3D array

2010-04-05 Thread Rob Latham
though. Nothing prevents rank 30 from hitting that loop before rank 2 does. To ensure order, you could MPI_SEND a token around a ring of MPI processes. Yuck. One approach might be to use MPI_SCAN to collect offsets (the amount of data each process will write) and then do an MPI_FILE_WRITE_AT_ALL
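
A brief C sketch of the MPI_SCAN approach (byte-sized writes and an int-sized count assumed; error checks omitted):

    #include <mpi.h>

    /* Each rank contributes mylen bytes; a prefix sum gives its file offset,
     * then everyone writes collectively -- no token passing required. */
    void ordered_write(MPI_File fh, const char *buf, MPI_Offset mylen)
    {
        MPI_Offset end = 0;

        MPI_Scan(&mylen, &end, 1, MPI_OFFSET, MPI_SUM, MPI_COMM_WORLD);
        /* end - mylen is this rank's exclusive prefix, i.e. where it starts */
        MPI_File_write_at_all(fh, end - mylen, buf, (int)mylen, MPI_BYTE,
                              MPI_STATUS_IGNORE);
    }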

Re: [OMPI users] Problems Using PVFS2 with OpenMPI

2010-01-13 Thread Rob Latham
h OpenMPI as well as an example program > because this is the first time I've attempted this so I may well be > doing something wrong. It sounds like you're on the right track. I should update the PVFS quickstart for the OpenMPI specifics. In addition to pvfs2-ping and pvfs2-ls

Re: [OMPI users] nonblocking MPI_File_iwrite() does block?

2010-01-06 Thread Rob Latham
On Mon, Nov 23, 2009 at 01:32:24PM -0700, Barrett, Brian W wrote: > On 11/23/09 8:42 AM, "Rob Latham" wrote: > > > Is it OK to mention MPICH2 on this list? I did prototype some MPI > > extensions that allowed ROMIO to do true async I/O (at least as far > >

Re: [OMPI users] nonblocking MPI_File_iwrite() does block?

2009-11-23 Thread Rob Latham
supports it). If you really need to experiment with async I/O, I'd love to hear your experiences. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] nonblocking MPI_File_iwrite() does block?

2009-11-20 Thread Rob Latham
tions to the table and reduce your overall I/O costs, perhaps even reducing them enough that you no longer miss true asynchronous I/O ? ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] MPI_File_open return error code 16

2009-10-22 Thread Rob Latham
;, ret); > } else { > MPI_File_close(&fh); > } > MPI_Finalize(); > return 0; > } The error code isn't very interesting, but if you can turn that error code into a human readable string with the MPI_Error_string() routine, then maybe you'll have a hint as to what is causing the problem. ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA
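
A tiny C sketch of turning such a return code into text with MPI_Error_string (the message wording is illustrative):

    #include <stdio.h>
    #include <mpi.h>

    /* Convert a numeric MPI error code into a readable message. */
    void report_mpi_error(int ret)
    {
        char msg[MPI_MAX_ERROR_STRING];
        int len;

        if (ret != MPI_SUCCESS) {
            MPI_Error_string(ret, msg, &len);
            fprintf(stderr, "MPI-IO call failed: %s\n", msg);
        }
    }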

Re: [OMPI users] Parallel I/O Usage

2009-07-08 Thread Rob Latham
tually using parallel I/O the right way. I think you're OK here. What are you seeing? Is this NFS? ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

Re: [OMPI users] MPI-IO: reading an unformatted binary fortran file

2009-06-16 Thread Rob Latham
hat you've written. The MPI-IO library just provides a wrapper around C system calls, so if you created this file with fortran, you'll have to read it back with fortran. Since you eventually want to do parallel I/O, I'd suggest creating the file with MPI-IO (Even if it is MPI_FILE_WRITE from rank 0 or a single process) as well as reading it back (perhaps with MPI_FILE_READ_AT_ALL). ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA

[OMPI users] Migrating from lam-mpi

2008-05-13 Thread Rob Malpass
nd exactly what I'm after - is there a setup guide somewhere? Thanks Rob

Re: [OMPI users] Cross platform run: error occurred in MPI_Waitall...

2007-05-23 Thread Rob
aServer (es40) and HP/Itanium (64 bit) OpenMPI: 1.2.2 errnos: perl -e 'die$!=54' Connection reset by peer at -e line 1. perl -e 'die$!=61' Connection refused at -e line 1. Thanks, Rob.

[OMPI users] Cross platform run: error occurred in MPI_Waitall...

2007-05-23 Thread Rob
][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=61 mpiexec noticed that job rank 0 with PID 1936 on node es40 exited on signal 15 (Terminated). 4 additional processes aborted (not shown) Could somebody give me a clue what has gone wrong here? Thanks, Rob

Re: [OMPI users] AlphaServers & OpenMPI

2007-05-13 Thread Rob
or will it then erroneously use the ones in /usr/bin ? Even better: is there a patch available to fix this in the 1.2.1 tarball, so that I can set the full path again with CC? (is it only the CC macro, or also F77, F90, CXX, CPP and CXXCPP?) Thanks, Rob.

Re: [OMPI users] AlphaServers & OpenMPI

2007-05-12 Thread Rob
-1.3XXX is under active development. What is recommended to use in this case for my cluster system? I would prefer that the appropriate fix also be applied to the 1.2.x series, so that I can rely on a stable version. Thanks, Rob.
