Jeff Squyres wrote:
Arif --
Sorry for the delay in replying.
Believe it or not, almost this exact issue just came up with the IBM
Benchmark Center; they were using Open MPI with MPIRandomAccess and
experiencing problems with running out of memory. We didn't get a
full set of data and experiments run; it was somewhat odd that the
problem seemed to happen most often with the Intel compilers
(preliminary tests showed that we couldn't replicate the problem with
the gcc compiler on the same problem size).
However, the IBM Benchmark Center engineers were able to get
successful runs in by using the btl_openib_free_list_max MCA
parameter. This parameter essentially limits how much space the
lowest-level IB driver in OMPI uses for fragment lists (it's actually
fairly complex as to what it exactly does and how it helps in this
situation -- insert "waving hands" image here...). This parameter
defaults to "infinite". Setting it to a finite value can allow
MPIRandomAccess to complete; I believe that the IBC engineers used
values of 2000 and 4000 for their systems.
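
For illustration, here is a minimal sketch of how such a run might be
launched. It only builds the mpirun command line, using the standard
"--mca <name> <value>" form to pass btl_openib_free_list_max. The helper
name, the process count of 16 and the ./hpcc binary path are assumptions
made for the example, not details from this thread.

```python
def mpirun_command(free_list_max, num_procs, program="./hpcc"):
    """Build an mpirun argument list that caps btl_openib_free_list_max.

    The program path and process count are placeholders; adjust them
    to match the local installation.
    """
    if free_list_max <= 0:
        raise ValueError("free_list_max must be a positive integer")
    return [
        "mpirun",
        "-np", str(num_procs),
        # Pass the MCA parameter on the command line.
        "--mca", "btl_openib_free_list_max", str(free_list_max),
        program,
    ]


if __name__ == "__main__":
    # Print one command per candidate value so each can be tried in turn.
    for value in (2048, 4096, 8192):
        print(" ".join(mpirun_command(value, num_procs=16)))
```

The printed commands can be pasted into a shell or job script; the same
list could also be handed to subprocess.run to launch the job directly.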
Thanks, that's great; that worked.
We are also using IBM machines (IBM x3455), but we are using the gcc
compiler that ships by default with SLES 10.
I have successfully run HPCC with values of 2048, 4096 and 8192; I
have left it at 2048 for now and am continuing to test.
Is it better for this value to be high or low?
Regards,
--
Arif Ali
Software Engineer
OCF plc
Mobile: +44 (0)7970 148 122
DDI: +44 (0)114 257 2240
Office: +44 (0)114 257 2200
Fax: +44 (0)114 257 0022
Email: a...@ocf.co.uk
Web: http://www.ocf.co.uk
Support Phone: +44 (0)845 702 3829
Support E-mail: supp...@ocf.co.uk
Skype: arif_ali80
MSN: a...@ocf.co.uk