On Sat, Apr 20, 2019 at 11:47 AM Borislav Petkov <b...@alien8.de> wrote:
> IOW, when you have the CEC enabled, you don't need to log memory errors
> with a userspace agent. The CEC collects them and discards them if they
> don't repeat.

So, you mean breaking mcelog is intentional? If so, why not break it
loudly? That is, for example, prevent mcelog from starting by disabling
CONFIG_X86_MCELOG_LEGACY in Kconfig _automatically_ when CONFIG_RAS is
enabled? (Like what I showed in my PoC change.) Or, for another example,
print a kernel warning to let users know this behavior is intentional?

>
> If they do repeat, then it offlines the page.
>
> Without user intervention and interference.
>
> Now, if you still want to know how many errors and where they happened
> and when they happened and yadda yadda, you *disable* the CEC.

Well, I believe rasdaemon has the counters too; it is not hard to count
the trace events at all. I don't worry about this at all. What I worry
about is how we treat mcelog when CONFIG_RAS=y.

>
> I hope this makes more sense now.

Yes, thanks for the information. It is kinda what I expected. As I keep
saying, I believe we can improve this situation to avoid users' confusion,
rather than just saying CONFIG_RAS=n is the answer.

Thanks.