> If that is the case, the tolerance level might be the better approach
> after all...
We should also take a look at mce_start():

	int cpus = num_online_cpus();
	...
	/*
	 * Wait for everyone.
	 */
	while (atomic_read(&mce_callin) != cpus) {

since offline cpus will still show up to rendezvous ... perhaps
"num_present_cpus()" is the right number??

-Tony
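
P.S. A rough userspace sketch of the counting concern, in case it helps. This is not kernel code: "callin" and rendezvous_completes() are hypothetical stand-ins, and the CPU counts in the usage note are made up. It only models the exact-equality test in the wait loop, assuming every CPU that shows up increments the counter once.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mce_callin counter (not the kernel's). */
static atomic_int callin;

/*
 * Let 'arrivals' CPUs check in, one increment each, then evaluate the
 * wait condition against 'expected'. Returns true if the counter equals
 * the expected value once everyone has arrived, i.e. the "!=" loop
 * would be able to exit at that point.
 */
static bool rendezvous_completes(int arrivals, int expected)
{
	atomic_store(&callin, 0);

	for (int i = 0; i < arrivals; i++)
		atomic_fetch_add(&callin, 1);

	return atomic_load(&callin) == expected;
}
```

With 4 online and 2 offline-but-present CPUs all showing up, rendezvous_completes(6, 4) is false (waiting on the online count) while rendezvous_completes(6, 6) is true (waiting on the present count). A real waiter could still slip out if it happened to sample the counter at the moment it passed through 4, so the sketch only shows why the final count and the expected count need to agree.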