How about sending a 'ping' to a socket periodically which is monitored
by an auxiliary program that runs where the master process runs?
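
To make the idea concrete, here is a minimal sketch of such a heartbeat in C. It assumes UDP datagrams on the loopback interface, a port chosen by you, and a 4-byte "ping" payload. The names hb_listen, hb_ping and hb_wait are made up for illustration and are not part of any library. If the workers run on other nodes, the loopback address would have to be replaced by the master node's address.

```c
/* Heartbeat sketch: the monitored process sends "ping" datagrams and an
 * auxiliary monitor waits for them with a timeout.
 * Assumptions: UDP on 127.0.0.1, caller-chosen port, hypothetical hb_* names. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Fill in a loopback address for the given port. */
static void hb_addr(struct sockaddr_in *addr, unsigned short port)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(port);
}

/* Monitor side: bind a UDP socket to 127.0.0.1:port.
 * Returns the socket descriptor, or -1 on error. */
int hb_listen(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    hb_addr(&addr, port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Monitored side: send a single "ping" datagram to the monitor.
 * Returns 0 on success, -1 on error. */
int hb_ping(unsigned short port)
{
    struct sockaddr_in addr;
    ssize_t n;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    hb_addr(&addr, port);
    n = sendto(fd, "ping", 4, 0, (struct sockaddr *)&addr, sizeof addr);
    close(fd);
    return n == 4 ? 0 : -1;
}

/* Monitor side: wait up to timeout_ms milliseconds for a ping.
 * Returns 1 if a ping arrived, 0 on timeout (the process may be hung),
 * and -1 on a socket error or an unexpected payload. */
int hb_wait(int fd, int timeout_ms)
{
    char buf[16];
    fd_set rfds;
    struct timeval tv;
    ssize_t n;
    int rc;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc <= 0)
        return rc; /* 0 = timeout, -1 = error */

    n = recv(fd, buf, sizeof buf, 0);
    return (n == 4 && memcmp(buf, "ping", 4) == 0) ? 1 : -1;
}
```

The monitor would call hb_listen once and then hb_wait in a loop, treating a timeout as a sign that the process has hung. The monitored process would call hb_ping at regular intervals, e.g. once per iteration of its main loop.
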
Also, I know you don't want to delve into the third-party libs but have
you actually tried to get to the bottom of the hang, e.g. run an strace,
attach a debu
d in romio of the v2.x series and
master.
That being said, the default io module here is ompio, and it is
currently buggy, so if you are using these series, you need to run
    mpirun --mca io romio314 ...
for the time being.
Cheers,
Gilles
On 5/31/2016 2:27 PM, Cihan Altinay wrote:
Hello list,
I
there anything I'm missing or is this a regression?
Thanks,
Cihan
--
Cihan Altinay
Computer Scientist, Centre for Geoscience Computing
eResearch Analyst, Queensland Cyber Infrastructure Foundation
The University of Queensland
T: +61 7 334 64118 / F: +61 7 334 64134
#include
#include
#in