Rob,

Thanks. I understand the issues better now. I'll let you fix ROMIO then, and I'll get on with a VOL plugin for shipping data off to our BGAS nodes, bypassing the current drivers, so I don't have to worry about some of those issues ...
JB

> -----Original Message-----
> From: Rob Latham [mailto:[email protected]]
> Sent: 04 September 2013 15:52
> To: Biddiscombe, John A.
> Cc: HDF Users Discussion List
> Subject: Re: HDF5 and GPFS optimizations
>
> On Tue, Sep 03, 2013 at 07:21:16PM +0000, Biddiscombe, John A. wrote:
> > Rob
> >
> > Thanks very much for this info. I've been reading the manuals and
> > getting up to speed with the system. I've set some benchmarks running
> > for parallel I/O using multiple datasets, compound data types, etc.
> >
> > When you say ...
> >
> > > More generally, I've found that some of the default MPI-IO settings
> > > are probably not ideal for /Q, and have tested/suggested a change to
> > > the "number of I/O aggregators" defaults.
> >
> > Do you mean aggregators inside ROMIO, or GPFS itself?
>
> I'm speaking about the MPI-IO (ROMIO) library. For Blue Gene, the code
> hasn't changed much since /L. Our /Q has 64x more parallelism per node
> than /L, so one can imagine the assumptions made in 2004 might need to
> be updated :>
>
> Some of that is simple tuning of defaults. We're also talking with
> IBM folks about some more substantial ROMIO changes.
>
> > I was under the impression that on BGQ machines (which is what I'm
> > targeting), the I/O was shipped to the I/O nodes, which performed
> > aggregation anyway. This is what I was referring to when I said
> > "shuffling data twice" - there's no point in HDF5/MPI-IO performing
> > collective I/O if this task was being done by the OS. Am I to
> > understand that the I/O nodes don't natively do a very good job of it
> > and need some assistance?
>
> The I/O nodes on Blue Gene have never been sophisticated. They relay
> system calls. The end. No re-ordering, no coalescing, no caching (OK,
> GPFS has a page pool on the I/O node, but that's GPFS doing the caching,
> not the I/O node daemon, so I make a distinction).
>
> ==rob
>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
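For anyone who wants to experiment with the aggregator tuning Rob mentions without rebuilding anything: ROMIO can pick up hints at run time from a plain-text file named by the ROMIO_HINTS environment variable, and the cb_nodes hint sets the number of I/O aggregators used for collective buffering. A minimal sketch (the value 8 is only a placeholder for illustration, not a tuned /Q default):

```shell
# ROMIO reads hints from a plain-text file named by the ROMIO_HINTS
# environment variable: one "key value" pair per line.
# The values below are illustrative, not tuned recommendations.
cat > romio_hints.txt <<'EOF'
cb_nodes 8
romio_cb_write enable
EOF

# Point ROMIO at the hints file before launching the application.
export ROMIO_HINTS="$PWD/romio_hints.txt"
echo "hints file: $ROMIO_HINTS"
```

The same hints can also be passed programmatically through an MPI_Info object supplied to MPI_File_open, or from HDF5 via the info argument of H5Pset_fapl_mpio.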
