Hey everyone, I have been working with CUDA-aware OpenMPI for quite some time now, and a while back I developed a small exception-handling framework that covers MPI and CUDA exceptions. Currently I use MPI_Abort with custom error numbers to terminate everything gracefully. This works well: in case of a crash, I just read the log file.
Now I am wondering how to handle return / exit codes properly across processes, since we would like to filter non-zero exits by return code. One option (in my case) is a simple Allreduce followed by exit instead of Abort. The problem is that the resulting values always seem "random" (I was using negative codes); it only seems to work correctly when I use MPI error codes, but those are of limited use. Any suggestions on how to do this so that it works properly?

BR
Alex

DI Alexander Stadik
Head of Large Scale Solutions, Research & Development | Large Scale Solutions
ESS Engineering Software Steyr GmbH
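A possible explanation for the "random" values, assuming a POSIX system: only the low 8 bits of the value passed to exit reach the parent process, so a negative code such as -3 is reported as 253. The sketch below illustrates that truncation and one way to fold custom codes into the 1..255 range before exiting. The function names `observed_exit_status` and `to_exit_status` are hypothetical and not part of MPI or of the framework described above.

```c
#include <stdlib.h>

/* What a POSIX parent (a shell or a launcher) observes after exit(code):
 * only the low 8 bits survive, so -3 is seen as 253 and 256 as 0. */
int observed_exit_status(int code) {
    return code & 0xFF;
}

/* Hypothetical mapping (an assumption, not an MPI facility):
 * fold a custom error code, possibly negative, into 1..255 and
 * keep 0 reserved for success. Magnitudes above 255 saturate. */
int to_exit_status(int code) {
    if (code == 0) {
        return EXIT_SUCCESS;
    }
    if (code < 0) {
        /* Compare before negating to avoid overflow on INT_MIN. */
        return (code < -255) ? 255 : -code;
    }
    return (code > 255) ? 255 : code;
}
```

With such a mapping, each rank could compute its local status, agree on one value with MPI_Allreduce (for example with MPI_INT and MPI_MAX on MPI_COMM_WORLD), call MPI_Finalize, and then pass the agreed value to exit. Whether the launcher forwards that value unchanged depends on the MPI implementation, so it is worth checking the mpirun documentation.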