On 3/3/20 4:53 PM, Lukas Auer wrote:
> On Mon, 2020-03-02 at 18:43 -0500, Sean Anderson wrote:
>> On 3/2/20 6:17 PM, Lukas Auer wrote:
>>> Don't move this. It is intended to be run before the IPI is cleared.
>>
>> Hm, ok. I think I moved it to after because of the 'if (!smp_function)'
>> check, but those two don't really need to be done together.
>>
> 
> Thanks! We had problems with code corruption in some situations,
> because some secondary harts entered OpenSBI after the main hart while
> OpenSBI expected all harts to be running OpenSBI by that time. Moving
> this code block was part of the fix for this situation, see [1].
> 
> [1]: 
> https://gitlab.denx.de/u-boot/u-boot/commit/90ae28143700bae4edd23930a7772899ad259058

Ah, this makes a lot more sense why it was located where it was.

>>>>    /*
>>>>     * Clear the IPI to acknowledge the request before jumping to the
>>>>     * requested function.
>>>>     */
>>>>    ret = riscv_clear_ipi(hart);
>>>>    if (ret) {
>>>> -          pr_err("Cannot clear IPI of hart %ld\n", hart);
>>>> +          pr_err("Cannot clear IPI of hart %ld (error %d)\n", hart, ret);
>>>>            return;
>>>>    }
>>>>  
>>>> +  __smp_mb();
>>>> +
>>>> +  smp_function = (void (*)(ulong, ulong, ulong))gd->arch.ipi[hart].addr;
>>>> +  /*
>>>> +   * There may be an IPI raised before u-boot begins execution, so check
>>>> +   * to ensure we actually have a function to call.
>>>> +   */
>>>> +  if (!smp_function)
>>>> +          return;
>>>> +  log_debug("hart = %lu func = %p\n", hart, smp_function);
>>>
>>> The log messages might be corrupted if multiple harts are calling the
>>> log function here. I have not looked into the details so this might not
>>> be an issue. In that case it is fine to keep, otherwise please remove
>>> it.
>>
>> I ran into this problem a lot when debugging. I ended up implementing a
>> spinlock around puts/putc. I agree it's probably better to remove this,
>> but I worry that concurrency bugs will become much harder to track down
>> without some kind of feedback. (This same criticism applies to the log
>> message above as well).
>>
> 
> Especially with your changes, I hope we already have or will soon reach
> a code robustness level where we won't have too many concurrency bugs
> in the future. :)
> Let's remove it for now until the logging backend can handle this
> cleanly.

Ok. Should the error message above ("Cannot clear IPI of hart...") also
be removed? I found it tended to corrupt the log output if it was ever
triggered.

--Sean

Reply via email to