On 9/8/20 8:01 PM, Simon Glass wrote: > Hi Sean, > > On Tue, 8 Sep 2020 at 17:59, Sean Anderson <sean...@gmail.com> wrote: >> >> On 9/8/20 7:56 PM, Simon Glass wrote: >>> Hi Sean, >>> >>> On Mon, 7 Sep 2020 at 09:51, Sean Anderson <sean...@gmail.com> wrote: >>>> >>>> On 9/7/20 9:57 AM, Simon Glass wrote: >>>>> Hi Sean, >>>>> >>>>> On Sun, 6 Sep 2020 at 20:02, Sean Anderson <sean...@gmail.com> wrote: >>>>>> >>>>>> On 9/6/20 9:43 PM, Simon Glass wrote: >>>>>>> Hi Sean, >>>>>>> >>>>>>> On Tue, 1 Sep 2020 at 13:56, Sean Anderson <sean...@gmail.com> wrote: >>>>>>>> >>>>>>>> get_ticks does not always succeed. Sometimes it can be called before >>>>>>>> the >>>>>>>> timer has been initialized. If it does, it returns a negative errno. >>>>>>>> This causes the timer to appear non-monotonic, because the value will >>>>>>>> become much smaller after the timer is initialized. >>>>>>>> >>>>>>>> No users of get_ticks which I checked handle errors of this kind. >>>>>>>> Further, >>>>>>>> functions like tick_to_time mangle the result of get_ticks, making it >>>>>>>> very >>>>>>>> unlikely that one could check for an error without suggesting a patch >>>>>>>> such >>>>>>>> as this one. >>>>>>>> >>>>>>>> This patch changes get_ticks to always return 0 when there is an error. >>>>>>>> 0 is the least unsigned integer, ensuring get_ticks appears monotonic. >>>>>>>> This >>>>>>>> has the side effect of time apparently not passing until the timer is >>>>>>>> initialized. However, without this patch, time does not pass anyway, >>>>>>>> because the error value is likely to be the same. >>>>>>>> >>>>>>>> Fixes: c8a7ba9e6a5 >>>>>>>> Signed-off-by: Sean Anderson <sean...@gmail.com> >>>>>>>> --- >>>>>>>> >>>>>>>> lib/time.c | 4 ++-- >>>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>>>> >>>>>>> Would it be better to panic so people can fix the bug? >>>>>> >>>>>> I thought this was expected behavior. It's only a bug if you do >>>>>> something like udelay before any timers are created. We just can't >>>>>> report errors through get_ticks, because its users assume that it always >>>>>> returns a time of some kind. >>>>> >>>>> I think it indicates a bug. If you use a device before it is ready you >>>>> don't really know what it will do. I worry that this patch is just >>>>> going to cause confusion, since the behaviour depends on when you call >>>>> it. If we panic, people can figure out why the timer is being inited >>>>> too late, or being used too early. >>>> >>>> Hm, maybe. I don't think it's as clear cut as "us[ing] a device before >>>> it is ready," because get_ticks tries to initialize the timer if it >>>> isn't already initialized. Unless someone else does it first, the first >>>> call to get_ticks will always be before the timer is initialized. >>>> >>>> The specific problem I ran into was that after relocation, the watchdog >>>> may be initialized before the timer. This occurs on RISC-V because >>>> without [1] a timer only exists after arch_early_init_r. So, for the >>>> first few calls to watchdog_reset there is no timer. >>>> >>>> The second return could probably be turned into a panic. I checked, and >>>> all current timer drivers always succeed in getting the time (except for >>>> the RISC-V timer, which is fixed in [1]), so the only way for >>>> timer_get_count to fail is if timer_ops.get_count doesn't exist. That is >>>> almost certainly an error on the driver author's part, so I think >>>> panicking there is the only reasonable option. >>> >>> OK good, let's do that and update docs in timer.h >> >> That being to panic both times, or just panic the second time? > > Well I like a panic if the call is invalid, ie. in both cases.
Ok, sounds good to me. --Sean