Hi,
> Just tried the patch, it doesn't fix the freeze symptoms for me.
> It just freezes at Test #7 (Block Move) at the 4096-6114M, as many
> people seem to report on the internet.
I think I have a correct patch for you (and for this bug). I just
posted it to the upstream forum:
https://forum.canardpc.com/archive/index.php/t-117454.html
The bug is directly inside the test #7 code (test.c: block_move()). The
function calculate_chunk(&start, &end,...) returns the addresses of the
first (start) and the last (end) word to be tested by this CPU. The
variable end is then incremented so that it points just past the tested
memory. After that, this big block is divided into one or more smaller
blocks of size <= 256MB. But this division does not happen if an integer
overflow occurs during the increment: end wraps around to 0, the check
(pe >= end) is true right away, and the whole chunk is handled as a
single block.
So the current code may lead to a different number of blocks for the
last CPU (the one with the highest memory address) than for all the
others. A different number of blocks also means a different number of
calls to do_tick(me); and therefore a different number of calls to
s_barrier();. Then the deadlock is inevitable.
The bug can't happen if you have 8 or more CPU threads, because then
calculate_chunk() returns a block of size 256MB or less. You should
also be safe with less than 5 GB of RAM (this number is not exact).
The patch is attached. It should work correctly with the Debian version
of memtest86+. But if you have some other patches applied, check that
all three hunks change the function block_move().
Jan
--- a/test.c
+++ b/test.c
@@ -1202,7 +1202,7 @@ void block_move(int iter, int me)
} else {
pe = end;
}
- if (pe >= end) {
+ if ((pe >= end && end != 0) || (pe < p && end == 0)) {
pe = end;
done++;
}
@@ -1280,7 +1280,7 @@ void block_move(int iter, int me)
} else {
pe = end;
}
- if (pe >= end) {
+ if ((pe >= end && end != 0) || (pe < p && end == 0)) {
pe = end;
done++;
}
@@ -1359,7 +1359,7 @@ void block_move(int iter, int me)
} else {
pe = end;
}
- if (pe >= end) {
+ if ((pe >= end && end != 0) || (pe < p && end == 0)) {
pe = end;
done++;
}