Kris,
Using MX_CSUM should _not_ make a difference by itself. But it
requires the debug library which may alter the timing enough to avoid
a race (in MX, OMPI, or the application).
Correct, if you use the MTL then all messages are handled by MX
(internode, shared memory and self).
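
For anyone scripting the comparison, here is a minimal sketch of building the mpirun command line with the "-mca pml cm" option mentioned elsewhere in this thread. The helper name, the process count, and the program path are hypothetical placeholders, not anything taken from Kris's actual setup.

```python
def mpirun_with_cm_pml(nprocs, program, *program_args):
    """Build an mpirun argv that requests the CM PML (the MTL path).

    nprocs, program and program_args are placeholders for illustration.
    """
    # "-mca pml cm" is the option quoted in this thread; with it, Open MPI
    # hands message matching to an MTL instead of going through the BTLs.
    return ["mpirun", "-np", str(nprocs), "-mca", "pml", "cm",
            program, *program_args]


if __name__ == "__main__":
    # "./my_app" is a hypothetical binary name.
    print(" ".join(mpirun_with_cm_pml(4, "./my_app")))
```

The list form can be passed straight to subprocess.run without any shell quoting.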
Scott

Scott,
Thanks for your advice! Good to know about the checksum debug
functionality! Strangely enough, running with either "MX_CSUM=1" or "-mca
pml cm" allows Murasaki to work normally, and makes the test case I
attached in my previous mail work. Very suspicious, but at least this
does make a functi

Hi Kris,
I have not run your code yet, but I will try to this weekend.
You can have MX checksum its messages if you set MX_CSUM=1 and use the
MX debug library (e.g. LD_LIBRARY_PATH to /opt/mx/lib/debug).
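
As a rough sketch of that setup, the helper below prepares an environment with MX_CSUM=1 and the debug library directory placed first in LD_LIBRARY_PATH. The function name is made up for illustration, and /opt/mx/lib/debug is only the example path given above; the install location on a given cluster may differ.

```python
import os


def mx_checksum_env(base_env=None, debug_lib_dir="/opt/mx/lib/debug"):
    """Return a copy of the environment with MX checksumming requested.

    debug_lib_dir defaults to the example path from this thread and is an
    assumption about where the MX debug library is installed.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MX_CSUM"] = "1"
    existing = env.get("LD_LIBRARY_PATH", "")
    # Put the debug directory first so the loader prefers it over the
    # regular MX library.
    if existing:
        env["LD_LIBRARY_PATH"] = debug_lib_dir + os.pathsep + existing
    else:
        env["LD_LIBRARY_PATH"] = debug_lib_dir
    return env


if __name__ == "__main__":
    env = mx_checksum_env()
    print(env["MX_CSUM"], env["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as env= to subprocess.run when launching mpirun. On multi-node runs the variables may also need to reach the remote ranks, for example with mpirun's -x option (-x MX_CSUM -x LD_LIBRARY_PATH).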
Do you have the problem if you use the MX MTL? To test it modify your
mpirun as follow

Hi. I've now spent many, many hours tracking down a bug that was causing
my program to die, as though either its memory were getting corrupted or
messages were getting clobbered while going through the network; I
couldn't tell which. I really wish the checksum flag on btl_mx_flags
were working. But