Sorry for the interruption. I am back on the MPI track again.

I have rebuilt openmpi-1.0.2a9 with -g and the error is unchanged.

I have also discovered that I don't need to run any Open MPI application to trigger the error.

Running mpirun --help, or just mpirun with no arguments, triggers the same error:
valiron@icare ~ > mpirun
Segmentation fault (core dumped)

and
valiron@icare ~ > pstack core
core 'core' of 13842:   mpirun
fffffd7ffee9dfe0 strlen () + 20
fffffd7ffeef6ab3 vsprintf () + 33
fffffd7fff180fd1 opal_vasprintf () + 41
fffffd7fff180f88 opal_asprintf () + 98
00000000004098a3 orterun () + 63
0000000000407214 main () + 34
000000000040708c ???????? ()

This seems very basic!

Using dbx produces a little more information, which is unfortunately cryptic to me:

valiron@icare ~ > dbx /users/valiron/lib/openmpi-1.0.2a9/bin/mpirun
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.5' in your .dbxrc
Reading mpirun
Reading ld.so.1
Reading liborte.so.0.0.0
Reading libopal.so.0.0.0
Reading libdl.so.1
Reading libm.so.2
Reading libnsl.so.1
Reading libsocket.so.1
Reading libthread.so.1
Reading libc.so.1
(dbx) run
Running: mpirun
(process id 13881)
t@1 (l@1) signal SEGV (no mapping at the fault address) in strlen at 0xfffffd7ffee9dfe0
0xfffffd7ffee9dfe0: strlen+0x0020:      cmpb     $0x0000000000000000,(%rsi)
Current function is opal_vasprintf (optimized)
 206       length = vsprintf(*ptr, fmt, ap);
(dbx)

For information, I have copied the man page for vsprintf() below:

Standard C Library Functions                          vprintf(3C)

NAME
    vprintf, vfprintf, vsprintf,  vsnprintf  -  print  formatted
    output of a variable argument list

SYNOPSIS
    #include <stdio.h>
    #include <stdarg.h>

    int vprintf(const char *format, va_list ap);

    int vfprintf(FILE *stream, const char *format, va_list ap);

    int vsprintf(char *s, const char *format, va_list ap);

    int vsnprintf(char *s, size_t n, const char *format, va_list
    ap);

DESCRIPTION
    The vprintf(), vfprintf(), vsprintf() and vsnprintf()  func-
    tions  are  the  same as printf(), fprintf(), sprintf(), and
    snprintf(),  respectively,  except  that  instead  of  being
    called  with a variable number of arguments, they are called
    with an argument list as defined in the  <stdarg.h>  header.
    See printf(3C).

    The <stdarg.h> header defines the type va_list and a set  of
    macros  for  advancing  through  a  list  of arguments whose
    number and types may vary. The argument  ap  to  the  vprint
    family  of  functions  is  of type va_list. This argument is
    used with the  <stdarg.h>  header  file  macros  va_start(),
    va_arg(),  and  va_end()  (see  stdarg(3EXT)). The  EXAMPLES
    section  below  demonstrates  the  use  of  va_start()   and
    va_end() with  vprintf().

    The macro va_alist() is used as  the  parameter  list  in  a
    function  definition,  as in the function called  error() in
    the example below.  The macro va_start(ap, parmN), where  ap
    is  of  type  va_list  and  parmN is the rightmost parameter
    (just before ...), must be  called  before  any  attempt  to
    traverse   and   access   unnamed  arguments  is  made.  The
    va_end(ap) macro must be invoked when all desired  arguments
    have been accessed. The argument list in ap can be traversed
    again if va_start() is called again after va_end().  In  the
    example  below, the error() arguments (arg1,  arg2, ...) are
    passed to vfprintf() in the argument ap.

RETURN VALUES
    Refer to printf(3C).

ERRORS
    The vprintf() and vfprintf() functions will fail  if  either
    the stream is unbuffered or the stream's buffer needed to be
    flushed and:

    EFBIG           The file is a regular file  and  an  attempt
                    was  made  to  write at or beyond the offset
                    maximum.
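
To make the va_list rules quoted above concrete, here is a minimal sketch of a vasprintf-style helper. It is only an illustration: the names my_vasprintf and my_asprintf are hypothetical, this is not the Open MPI opal_vasprintf() code, and I do not know whether any of this is related to the crash. It assumes a C99 libc, where vsnprintf(NULL, 0, ...) returns the required length and va_copy() is available. The point it shows is that an argument list is traversed once per va_list: the measuring pass works on a copy, so the formatting pass still sees an untouched ap.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not the Open MPI implementation): format into a
 * freshly allocated buffer. Returns the string length, or -1 on error. */
int my_vasprintf(char **out, const char *fmt, va_list ap)
{
    va_list ap2;
    int needed, written;

    *out = NULL;

    /* First pass: measure the output on a COPY of the argument list,
     * so that ap itself has not been traversed yet. */
    va_copy(ap2, ap);
    needed = vsnprintf(NULL, 0, fmt, ap2);
    va_end(ap2);
    if (needed < 0) {
        return -1;
    }

    *out = malloc((size_t)needed + 1);
    if (*out == NULL) {
        return -1;
    }

    /* Second pass: format for real, using the original list and a
     * bounded write so the buffer cannot be overrun. */
    written = vsnprintf(*out, (size_t)needed + 1, fmt, ap);
    if (written < 0) {
        free(*out);
        *out = NULL;
        return -1;
    }
    return written;
}

/* Variadic front end: va_start() before use, va_end() when done. */
int my_asprintf(char **out, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = my_vasprintf(out, fmt, ap);
    va_end(ap);
    return n;
}
```

With this sketch, my_asprintf(&s, "%s-%d", "np", 2) should return 4 and leave the string "np-2" in s, which the caller must free().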



Any ideas?

Of course I would be glad to provide an account on the machine (but, for security reasons, not via the list...).

Pierre.



Brian Barrett wrote:
On Feb 27, 2006, at 8:50 AM, Pierre Valiron wrote:

- Make completed nicely, except for compiling ompi/mpi/f90/mpi.f90, which took nearly half an hour. I suspect the optimization flags in FFLAGS are not important for applications, and I could use -O0 or -O1 instead.

You probably won't see any performance impact at all if you compile the Fortran 90 layer of Open MPI with no optimizations. It's a very thin wrapper and the compiler isn't going to be able to do much with it anyway. One other thing: if you know your F90 code never sends arrays greater than dimension X (X defaults to 4), you can speed things up immensely by configuring Open MPI with the option --with-f90-max-array-dim=X.

- However the resulting executable fails to launch:
valiron@icare ~/config > mpirun --prefix /users/valiron/lib/openmpi-1.0.2a9 -np 2 a.out
Segmentation fault (core dumped)

- The problem seems to be buried in Open MPI:
valiron@icare ~/config > pstack core
core 'core' of 27996: mpirun --prefix /users/valiron/lib/openmpi-1.0.2a9 -np 2 a.out
fffffd7fff05dfe0 strlen () + 20
fffffd7fff0b6ab3 vsprintf () + 33
fffffd7fff2e4211 opal_vasprintf () + 41
fffffd7fff2e41c8 opal_asprintf () + 98
00000000004098a3 orterun () + 63
0000000000407214 main () + 34
000000000040708c ???????? ()

Ugh... Yes, we're probably doing something wrong there. Unfortunately, neither Jeff nor I have access to an Opteron box running Solaris, and I can't replicate the problem on either an UltraSparc running Solaris or an Opteron running Linux. Could you compile Open MPI with CFLAGS set to "-g -O -xtarget=opteron -xarch=amd64"? Hopefully being able to see the call stack with some line numbers will help a bit.

Brian



--
Soutenez le mouvement SAUVONS LA RECHERCHE :
http://recherche-en-danger.apinc.org/

      _/_/_/_/    _/       _/       Dr. Pierre VALIRON
     _/     _/   _/      _/   Laboratoire d'Astrophysique
    _/     _/   _/     _/    Observatoire de Grenoble / UJF
   _/_/_/_/    _/    _/    BP 53  F-38041 Grenoble Cedex 9 (France)
  _/          _/   _/    http://www-laog.obs.ujf-grenoble.fr/~valiron/
 _/          _/  _/     Mail: pierre.vali...@obs.ujf-grenoble.fr
_/          _/ _/      Phone: +33 4 7651 4787  Fax: +33 4 7644 8821
_/ _/_/

