Hi,

I'm focusing on the MPI_Bcast routine, which seems to segfault at random
when the openib btl is used.
Is there any way to make Open MPI use a different algorithm for MPI_Bcast
than the one it selects by default?
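
One possibility, given here as an unverified sketch rather than a confirmed
fix: the "tuned" collective component may expose MCA parameters for forcing a
specific broadcast algorithm. The fragment below assumes that the parameters
coll_tuned_use_dynamic_rules and coll_tuned_bcast_algorithm are available in
this build (to be checked with "ompi_info --param coll tuned") and that they
are set through a per-user $HOME/.openmpi/mca-params.conf file. The value 1 is
only illustrative.

```
# Assumption: both parameters belong to the "tuned" coll component and must
# be confirmed against the output of: ompi_info --param coll tuned
# Allow the default algorithm decision to be overridden.
coll_tuned_use_dynamic_rules = 1
# Force one specific MPI_Bcast algorithm. The mapping from number to
# algorithm should be read from ompi_info, not guessed.
coll_tuned_bcast_algorithm = 1
```

The same two settings could presumably be passed on the mpirun command line
as "--mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_bcast_algorithm 1",
next to the existing "--mca btl" option.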

Thanks for your help,
Eloi 

On Friday 02 July 2010 11:06:52 Eloi Gaudry wrote:
> Hi,
> 
> I'm observing a random segmentation fault during an internode parallel
> computation involving the openib btl and OpenMPI-1.4.2 (the same issue
> can be observed with OpenMPI-1.3.3).
>    mpirun (Open MPI) 1.4.2
>    Report bugs to http://www.open-mpi.org/community/help/
>    [pbn08:02624] *** Process received signal ***
>    [pbn08:02624] Signal: Segmentation fault (11)
>    [pbn08:02624] Signal code: Address not mapped (1)
>    [pbn08:02624] Failing at address: (nil)
>    [pbn08:02624] [ 0] /lib64/libpthread.so.0 [0x349540e4c0]
>    [pbn08:02624] *** End of error message ***
>    sh: line 1:  2624 Segmentation fault
> \/share\/hpc3\/actran_suite\/Actran_11\.0\.rc2\.41872\/RedHatEL\-5\/x86_64\
> /bin\/actranpy_mp
> '--apl=/share/hpc3/actran_suite/Actran_11.0.rc2.41872/RedHatEL-5/x86_64/Ac
> tran_11.0.rc2.41872'
> '--inputfile=/work/st25652/LSF_130073_0_47696_0/Case1_3Dreal_m4_n2.dat'
> '--scratch=/scratch/st25652/LSF_130073_0_47696_0/scratch' '--mem=3200'
> '--threads=1' '--errorlevel=FATAL' '--t_max=0.1' '--parallel=domain'
> 
> If I choose not to use the openib btl (by using --mca btl self,sm,tcp on
> the command line, for instance), I don't encounter any problem and the
> parallel computation runs flawlessly.
> 
> I would like some help to:
> - diagnose the issue I'm facing with the openib btl
> - understand why this issue is observed only when using the openib btl
> and not when using self,sm,tcp
> 
> Any help would be very much appreciated.
> 
> The outputs of ompi_info and the configure scripts of OpenMPI are
> attached to this email, along with some information on the infiniband
> drivers.
> 
> Here is the command line used when launching a parallel computation
> using infiniband:
>    path_to_openmpi/bin/mpirun -np $NPROCESS --hostfile host.list --mca
> btl openib,sm,self,tcp  --display-map --verbose --version --mca
> mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 0 [...]
> and the command line used if not using infiniband:
>    path_to_openmpi/bin/mpirun -np $NPROCESS --hostfile host.list --mca
> btl self,sm,tcp  --display-map --verbose --version --mca
> mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 0 [...]
> 
> Thanks,
> Eloi

