After the 'commit b9cae27728d1 ("EDAC/ghes: Scan the system once on driver 
init")'
applied, following error has occurred in the ghes_edac_register() when
CONFIG_DEBUG_TEST_DRIVER_REMOVE is enabled. The null ghes_hw.dimms pointer
in the mci_for_each_dimm() of ghes_edac_register() caused the error.

The error occurs when all the previously initialized ghes instances are
removed and then probe a new ghes instance. In this case, the ghes_refcount
would be 0, ghes_hw.dimms and mci already freed. The ghes_hw.dimms would
be null because ghes_scan_system() would not call enumerate_dimms() again.

Suggested-by: Borislav Petkov <b...@suse.de>
Signed-off-by: Shiju Jose <shiju.j...@huawei.com>
---
 drivers/edac/ghes_edac.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index da60c29468a7..54ebc8afc6b1 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -55,6 +55,8 @@ static DEFINE_SPINLOCK(ghes_lock);
 static bool __read_mostly force_load;
 module_param(force_load, bool, 0);
 
+static bool system_scanned;
+
 /* Memory Device - Type 17 of SMBIOS spec */
 struct memdev_dmi_entry {
        u8 type;
@@ -225,14 +227,12 @@ static void enumerate_dimms(const struct dmi_header *dh, 
void *arg)
 
 static void ghes_scan_system(void)
 {
-       static bool scanned;
-
-       if (scanned)
+       if (system_scanned)
                return;
 
        dmi_walk(enumerate_dimms, &ghes_hw);
 
-       scanned = true;
+       system_scanned = true;
 }
 
 void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
@@ -631,6 +631,8 @@ void ghes_edac_unregister(struct ghes *ghes)
 
        mutex_lock(&ghes_reg_mutex);
 
+       system_scanned = false;
+
        if (!refcount_dec_and_test(&ghes_refcount))
                goto unlock;
 
-- 
2.17.1


Reply via email to