>> Btw, this driver is polling, AFAICT. Doesn't e3-12xx support the CMCI
>> interrupt which you can feed into this driver directly and thus not need
>> the polling at all?
>
> On the system with the ce and ue events that I'm testing on, I don't see
> 'MCE' nudge above 0, in /proc/interrupts. So I think that implies that
> we are not getting any CMCI there?

CMCI will bump up the "THR" (Threshold) entries in /proc/interrupts,
not the "MCE" ones.

> So if possible maybe we can confirm with Intel whether we expect an MCE
> for memory errors...

MCG_CAP bit 10 tells you whether a given processor implements CMCI.
If that is set, then MCi_CTL2 bit 30 indicates whether a given bank
supports it. (Linux tries to set this bit; if it sticks, then it knows
that CMCI is supported. Linux also assigns ownership of the bank to the
first cpu to successfully set it, since a bank may be shared by multiple
threads/cores on a package.)

Consumed uncorrectable errors should generate a machine check, which on
the E3-12xx series will be a fatal machine check: MCi_STATUS.PCC=1

-Tony
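
For anyone who wants to check these bits by hand, here is a rough sketch of the tests described above. It only decodes raw register values that you have already read (for example with rdmsr from msr-tools, which needs root); it does not read MSRs itself. The function names are made up for illustration and are not kernel or library APIs. The PCC position (MCi_STATUS bit 57) and the bank count field (MCG_CAP bits 7:0) come from the Intel SDM rather than from this thread.

```python
# Bit positions for the Intel machine-check registers discussed above.
MCG_CMCI_P = 1 << 10        # IA32_MCG_CAP bit 10: processor implements CMCI
MCG_BANKCNT_MASK = 0xFF     # IA32_MCG_CAP bits 7:0: number of MC banks
MCI_CTL2_CMCI_EN = 1 << 30  # IA32_MCi_CTL2 bit 30: CMCI enable for the bank
MCI_STATUS_PCC = 1 << 57    # IA32_MCi_STATUS bit 57: processor context corrupt


def cpu_has_cmci(mcg_cap):
    """True if MCG_CAP says the processor implements CMCI."""
    return bool(mcg_cap & MCG_CMCI_P)


def bank_supports_cmci(ctl2_readback):
    """True if bit 30 'stuck' in an MCi_CTL2 value read back after
    an attempt to set it (the probe Linux performs per bank)."""
    return bool(ctl2_readback & MCI_CTL2_CMCI_EN)


def cmci_banks(mcg_cap, ctl2_readbacks):
    """Return the bank numbers that support CMCI.

    ctl2_readbacks is a list of MCi_CTL2 values, index = bank number.
    """
    if not cpu_has_cmci(mcg_cap):
        return []
    nbanks = mcg_cap & MCG_BANKCNT_MASK
    return [bank for bank, value in enumerate(ctl2_readbacks[:nbanks])
            if bank_supports_cmci(value)]


def is_fatal(mci_status):
    """True if MCi_STATUS has PCC set, i.e. the machine check is fatal."""
    return bool(mci_status & MCI_STATUS_PCC)
```

As a usage example, cmci_banks(0xC09, [0x40000000, 0x0, 0x40000001]) treats 0xC09 as an MCG_CAP value with bit 10 set and nine banks, and reports banks 0 and 2 as CMCI-capable. The input values here are invented for illustration, not taken from an E3-12xx system.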