> On Apr 29, 2015, at 4:09 PM, Brice Goglin <brice.gog...@inria.fr> wrote:
> 
> Nothing wrong in that XML. I don't see what could be happening besides a
> node rebooting with hyper-threading enabled for random reasons.
> Please run "lstopo foo.xml" again on the node next time you get the OMPI
> failure (assuming you get a chance to log on the node before it reboots
> etc).

Thanks. Do you understand why Open MPI would even try to bind to core #16?
I’m pretty sure it was a 16-task job on a machine with 16 physical cores, so
shouldn’t it try to bind to cores 0-15 only?
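
To be concrete about what I would expect, here is a minimal sketch in Python.
It assumes one task per physical core, assigned in order starting from core 0.
The helper expected_core_bindings is made up for illustration and is not how
Open MPI actually computes its mapping.

```python
def expected_core_bindings(num_tasks, num_cores):
    """Return the core index each task would get with one task per core.

    Hypothetical helper for illustration only; this is NOT Open MPI's
    actual mapping code. Assumes cores are numbered 0..num_cores-1.
    """
    if num_tasks > num_cores:
        raise ValueError("more tasks than cores: oversubscribed")
    # Task i is bound to core i, so the highest index used is num_tasks - 1.
    return list(range(num_tasks))


bindings = expected_core_bindings(num_tasks=16, num_cores=16)
print(bindings[0], bindings[-1])  # lowest and highest core index used
print(16 in bindings)             # whether core #16 is ever requested
```

Under those assumptions the highest index requested is 15, so a request for
core #16 would only make sense if the node reported a different topology than
the one I think it has.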

                                                                                
Noam

