On Sep 27, 2013, at 5:33 PM, Guenter Roeck <li...@roeck-us.net> wrote:

> On 09/27/2013 11:03 AM, Chris Murphy wrote:
>> 
>> On Sep 27, 2013, at 11:59 AM, Guenter Roeck <li...@roeck-us.net> wrote:
>> 
>>> On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote:
>>>> 
>>>> On Sep 27, 2013, at 11:12 AM, Guenter Roeck <li...@roeck-us.net> wrote:
>>>> 
>>>>> On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
>>>>>> On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg <rydb...@euromail.se> 
>>>>>> wrote:
>>>>>>>>>> This suggests that initialization may be attempted more than once. 
>>>>>>>>>> The key cache
>>>>>>>>>> is allocated only once, but the number of keys is read for each 
>>>>>>>>>> attempt.
>>>>>>>>>> 
>>>>>>>>>> No idea if that can happen, but if the number of keys can increase 
>>>>>>>>>> after
>>>>>>>>>> the first initialization attempt you would have an explanation for 
>>>>>>>>>> the crash.
>>>>>>>>> 
>>>>>>>>> Good idea, and easy enough to test with the patch below.
>>>>>>>>> 
>>>>>>>> Should we apply this patch even though it may not solve the specific 
>>>>>>>> problem ?
>>>>>>> 
>>>>>>> Yes, why not - it certainly won't hurt. I am running it right now, so
>>>>>>> it is at least run-tested.
>>>>>>> 
>>>>>>>> Again, not sure if the key count can change, but the current code is 
>>>>>>>> at the very
>>>>>>>> least inconsistent, as it keeps reading the key count without updating 
>>>>>>>> or
>>>>>>>> verifying the cache size.
>>>>>>> 
>>>>>>> Yes - I agree that the error state is far-fetched, but it is hard to
>>>>>>> see any other logical explanation. There is of course always the
>>>>>>> possibility that the problem is somewhere else completely.
>>>>>>> 
>>>>>>> Proper patch attached.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Henrik
>>>>>>> 
>>>>>>> ---
>>>>>>> 
>>>>>>> From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
>>>>>>> From: Henrik Rydberg <rydb...@euromail.se>
>>>>>>> Date: Thu, 26 Sep 2013 08:33:16 +0200
>>>>>>> Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding
>>>>>>> 
>>>>>>> After reports from Chris and Josh Boyer of a rare crash in applesmc,
>>>>>>> Guenter pointed at the initialization problem fixed below. The patch
>>>>>>> has not been verified to fix the crash, but should be applied
>>>>>>> regardless.
>>>>>>> 
>>>>>>> Reported-by: <jwbo...@fedoraproject.org>
>>>>>>> Suggested-by: Guenter Roeck <li...@roeck-us.net>
>>>>>>> Signed-off-by: Henrik Rydberg <rydb...@euromail.se>
>>>>>>> ---
>>>>>>> drivers/hwmon/applesmc.c | 11 ++++++++++-
>>>>>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>> 
>>>>>> Thanks for the quick reply.  I'll get this rolled into our kernels soon.
>>>>>> 
>>>>> I sent a pull request to Linus, so you should be able to pull it from
>>>>> the upstream kernel shortly. Would be great to get feedback if the patch
>>>>> solves the problem (or doesn't).
>>>> 
>>>> I'll start running it when it appears in koji. It's very transient, maybe 
>>>> one oops per week with lots of (other) testing. I'm not even sure if it 
>>>> happens on warm or cold boots or both.
>>>> 
>>> When you do, can you possibly trigger an event based on the warning added
>>> with the patch ? This might help us to identify if the problem fixed
>>> with the patch actually happens.
>> 
>> I don't understand the question. I'm uncertain how to trigger, and also what 
>> event.
>> 
> 
> The patch includes a new warning message.
> 
>       pr_warn("key count changed from %d to %d\n",
>                        s->key_count, count);
> 
> It would be great if there would be a means to detect if this message is seen
> in a kernel log, because it would show that the potential crash condition
> fixed with the patch was actually encountered. This would help us to determine
> if we actually fixed the problem or not.
> 
> Of course, we'll know if is wasn't fixed if the system still crashes.

Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64. 

[   10.886016] applesmc: key count changed from 261 to 1174405121

Attaching new full dmesg to the bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1011719#c11

Chris--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to