> -----Original Message----- > From: linux-edac-ow...@vger.kernel.org <linux-edac- > ow...@vger.kernel.org> On Behalf Of Borislav Petkov > Sent: Wednesday, March 28, 2018 9:00 AM > To: Ghannam, Yazen <yazen.ghan...@amd.com> > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH 1/3] EDAC/amd64: Print ECC enabled/disabled for nodes > with enabled MCs > > On Wed, Mar 21, 2018 at 02:13:33PM -0500, Yazen Ghannam wrote: > > From: Yazen Ghannam <yazen.ghan...@amd.com> > > > > It's possible that a system can be used without any DRAM populated on > > one or more physical Dies on multi-die systems. Firmware will not > > enable DRAM ECC on Dies without DRAM. Users will then see a message > > about DRAM ECC disabled on those nodes without DRAM. However, DRAM > ECC > > may, in fact, be enabled on the other Dies that have DRAM. > > > > Only print ECC enabled/disabled information for nodes that have at least > > one enabled memory channel. > > So if the only reason for this is make the error messages more precise, > then let's not make it uglier than it is. > > The right way to do it would be to push those checks down to > debug_display_dimm_sizes* which looks at the CS rows and the chip select > enable bits and there to differentiate between > > * memory controller doesn't have DIMMs > > and > > * memory controller has DIMMs but ECC is disabled in the BIOS > > and then print the respective informative error message. But not with a > yet another boolean which kinda takes care of F17h only and leaves the > old families as they were. >
In either of those cases we won't get to debug_display_dimm_sizes* because we won't initialize the instance. We have a message for "memory controller doesn't have DIMMs". edac_dbg(0, "Node %d: No enabled UMCs.\n", nid); We can change the wording of this to make it more clear. Also, we can make this a pr_info and return from here. So something like this: diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c index 329cb96f886f..6e211ed6d419 100644 --- a/drivers/edac/amd64_edac.c +++ b/drivers/edac/amd64_edac.c @@ -3061,10 +3061,12 @@ static bool ecc_enabled(struct pci_dev *F3, u16 nid) } /* Check whether at least one UMC is enabled: */ - if (umc_en_mask) + if (umc_en_mask) { ecc_en = umc_en_mask == ecc_en_mask; - else - edac_dbg(0, "Node %d: No enabled UMCs.\n", nid); + } else { + amd64_info("Node %d: No DIMMs populated on memory controller.\n", nid); + return false; + } /* Assume UMC MCA banks are enabled. */ nb_mce_en = true; This would work for Fam17h. For older systems I think we can look at D18F2x94_dct[1:0][DisDramInterface] Or maybe we have a separate function to check for enabled memory controllers before we check for ECC? Thanks, Yazen