Hi Rolf,

Unfortunately, I couldn't get rid of that annoying segmentation fault when selecting another bcast algorithm. I'm now going to replace MPI_Bcast with a naive implementation (using MPI_Send and MPI_Recv) and see if that helps.
Regards,
Éloi

On Wednesday 14 July 2010 10:59:53 Eloi Gaudry wrote:
> Hi Rolf,
>
> thanks for your input. You're right, I missed the
> coll_tuned_use_dynamic_rules option.
>
> I'll check if the segmentation fault disappears when using the basic
> bcast linear algorithm with the proper command line you provided.
>
> Regards,
> Eloi
>
> On Tuesday 13 July 2010 20:39:59 Rolf vandeVaart wrote:
> > Hi Eloi:
> > To select the different bcast algorithms, you need to add an extra mca
> > parameter that tells the library to use dynamic selection.
> > --mca coll_tuned_use_dynamic_rules 1
> >
> > One way to make sure you are typing this in correctly is to use it with
> > ompi_info. Do the following:
> > ompi_info -mca coll_tuned_use_dynamic_rules 1 --param coll
> >
> > You should see lots of output with all the different algorithms that can
> > be selected for the various collectives.
> > Therefore, you need this:
> >
> > --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_bcast_algorithm 1
> >
> > Rolf
> >
> > On 07/13/10 11:28, Eloi Gaudry wrote:
> > > Hi,
> > >
> > > I've found that "--mca coll_tuned_bcast_algorithm 1" allowed me to switch
> > > to the basic linear algorithm. Anyway, whatever the algorithm used, the
> > > segmentation fault remains.
> > >
> > > Could anyone give some advice on ways to diagnose the issue I'm
> > > facing?
> > >
> > > Regards,
> > > Eloi
> > >
> > > On Monday 12 July 2010 10:53:58 Eloi Gaudry wrote:
> > >> Hi,
> > >>
> > >> I'm focusing on the MPI_Bcast routine that seems to randomly segfault
> > >> when using the openib btl. I'd like to know if there is any way to
> > >> make OpenMPI switch to a different algorithm than the default one
> > >> being selected for MPI_Bcast.
> > >>
> > >> Thanks for your help,
> > >> Eloi
> > >>
> > >> On Friday 02 July 2010 11:06:52 Eloi Gaudry wrote:
> > >>> Hi,
> > >>>
> > >>> I'm observing a random segmentation fault during an internode
> > >>> parallel computation involving the openib btl and OpenMPI-1.4.2 (the
> > >>> same issue can be observed with OpenMPI-1.3.3).
> > >>>
> > >>> mpirun (Open MPI) 1.4.2
> > >>> Report bugs to http://www.open-mpi.org/community/help/
> > >>> [pbn08:02624] *** Process received signal ***
> > >>> [pbn08:02624] Signal: Segmentation fault (11)
> > >>> [pbn08:02624] Signal code: Address not mapped (1)
> > >>> [pbn08:02624] Failing at address: (nil)
> > >>> [pbn08:02624] [ 0] /lib64/libpthread.so.0 [0x349540e4c0]
> > >>> [pbn08:02624] *** End of error message ***
> > >>> sh: line 1: 2624 Segmentation fault
> > >>> \/share\/hpc3\/actran_suite\/Actran_11\.0\.rc2\.41872\/RedHatEL\-5\/x86_64\/bin\/actranpy_mp
> > >>> '--apl=/share/hpc3/actran_suite/Actran_11.0.rc2.41872/RedHatEL-5/x86_64/Actran_11.0.rc2.41872'
> > >>> '--inputfile=/work/st25652/LSF_130073_0_47696_0/Case1_3Dreal_m4_n2.dat'
> > >>> '--scratch=/scratch/st25652/LSF_130073_0_47696_0/scratch'
> > >>> '--mem=3200' '--threads=1' '--errorlevel=FATAL' '--t_max=0.1'
> > >>> '--parallel=domain'
> > >>>
> > >>> If I choose not to use the openib btl (by using --mca btl self,sm,tcp
> > >>> on the command line, for instance), I don't encounter any problem and
> > >>> the parallel computation runs flawlessly.
> > >>>
> > >>> I would like to get some help to be able to:
> > >>> - diagnose the issue I'm facing with the openib btl
> > >>> - understand why this issue is observed only when using the openib
> > >>> btl and not when using self,sm,tcp
> > >>>
> > >>> Any help would be very much appreciated.
> > >>>
> > >>> The outputs of ompi_info and the configure scripts of OpenMPI are
> > >>> attached to this email, along with some information on the infiniband
> > >>> drivers.
> > >>>
> > >>> Here is the command line used when launching a parallel computation
> > >>> using infiniband:
> > >>> path_to_openmpi/bin/mpirun -np $NPROCESS --hostfile host.list --mca
> > >>> btl openib,sm,self,tcp --display-map --verbose --version --mca
> > >>> mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 0 [...]
> > >>>
> > >>> and the command line used if not using infiniband:
> > >>> path_to_openmpi/bin/mpirun -np $NPROCESS --hostfile host.list --mca
> > >>> btl self,sm,tcp --display-map --verbose --version --mca
> > >>> mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 0 [...]
> > >>>
> > >>> Thanks,
> > >>> Eloi
> > >
> > > _______________________________________________
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Eloi Gaudry
Free Field Technologies
Company Website: http://www.fft.be
Company Phone: +32 10 487 959