Hi Christian,
On 6/6/24 1:52 PM, Christian Marangi wrote:
On Thu, Jun 06, 2024 at 12:55:37PM +0200, Quentin Schulz wrote:
Hi Christian,
On 6/6/24 11:52 AM, Christian Marangi wrote:
On Thu, Jun 06, 2024 at 11:12:11AM +0200, Quentin Schulz wrote:
Hi Christian,
On 6/5/24 9:21 PM, Christian Marangi wrote:
This series expands the STATUS LED framework with a new color
and a big new feature. One thing that many devices need is a way
to communicate to the user that the device is actually doing
something.
This is especially useful for recovery steps where a
user (for example) inserts a USB drive, keeps a button pressed
and the device auto-recovers.
There is currently no way to signal the user externally that
the bootloader is processing/recovering aside from setting
an LED on.
A solid LED on is not enough and won't actually signal any
kind of progress.
The solution is the good old blinking LED, but U-Boot doesn't
suggest (or support) interrupts and almost all LEDs
are GPIO LEDs that don't support HW blink.
I haven't used it yet but we do have a cyclic framework now for things
happening in the background. I think this is a good use-case for this
framework? Something would set the blinking frequency (could be from CLI
directly, or as part of board files, or architecture, etc...) and the LED
would just blink then. This would make it possible to highlight stages in the boot
process, though that is not like an activity LED so if you're stuck in a
stage, you wouldn't know if something is still happening or if you're really
stuck (e.g. no packet on TFTP or TFTP very slow). The benefit is that it
would be way less intrusive than patching all commands that could make use
of that LED. Right now, this only adds support to MTD, SPI and TFTP, but
what about MMC, NVMe, USB, other net stuff (e.g. wget), etc...
Can you give me a hint on where this framework is? Checking the TFTP code I
couldn't find any extra call to it. Maybe it's attached to the schedule()
function?
https://docs.u-boot.org/en/latest/develop/cyclic.html
Thanks, it looks very interesting and it looks handy to make use of the
watchdog for it. I will now try to rework the sw blink implementation
to make use of the cyclic framework.
Also note that it's really not a one-time setting, since almost all devices
have GPIO LEDs with no way to support HW blink, so the
"activity" function needs to be called multiple times to increase the
counter and toggle the LED.
Cyclic callback would be called twice per expected blink period, where you
would toggle the GPIO (essentially making it 50% duty cycle, but could be
more fine-grained if you want a different duty cycle).
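To make the "twice per blink period" point concrete, here is a minimal sketch of the arithmetic and of what the periodic callback would do. It deliberately does not call the real cyclic API; `struct sw_blink`, `sw_blink_half_period_us()` and `sw_blink_tick()` are hypothetical names used only for illustration, and the GPIO write is replaced by a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software-blink state; not an existing U-Boot structure. */
struct sw_blink {
	bool on;            /* current LED level */
	unsigned int flips; /* number of toggles so far */
};

/* Delay between two callback invocations: half of the blink period. */
uint64_t sw_blink_half_period_us(unsigned int freq_hz)
{
	assert(freq_hz > 0);
	return 1000000ULL / (2ULL * freq_hz);
}

/*
 * What the periodic callback would do: invert the LED level.
 * A real implementation would write the GPIO here instead.
 */
void sw_blink_tick(struct sw_blink *b)
{
	b->on = !b->on;
	b->flips++;
}
```

With a 2 Hz blink the callback would be due every 250000 us, and two consecutive calls bring the LED back to its starting level, which gives the 50% duty cycle described above.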
Well, status LED already has CONFIG_STATUS_LED_FREQ where you can set a
value. I will just use this.
This actually only appears in the rST doc, nothing actually makes use of
this right now, so it's not something we **need** to use.
What I meant is, if you only provide a frequency, a specific, hardcoded,
pattern is expected. E.g. for 1 kHz, you enable the LED for 0.5ms and
disable it for 0.5ms (or 1ms and 1ms, depending on how you see
LED_STATUS_FREQ working). Could be 0.2ms and 0.8ms but it would always
be this. How do you differentiate between "something is happening on
NAND" and "TFTP is being used" if you don't have the ability to change
the duty cycle? Or are you expecting people to have multiple LEDs of
different colors for that?
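The on/off numbers above follow directly from frequency and duty cycle. A small sketch of that arithmetic, assuming a duty cycle expressed in percent (the `blink_timing` structure and `blink_timing_from()` are hypothetical helpers, not part of the status LED code):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: how long the LED is on and off per period. */
struct blink_timing {
	uint64_t on_us;
	uint64_t off_us;
};

/* Split one blink period into on/off time for a given duty cycle. */
struct blink_timing blink_timing_from(unsigned int freq_hz,
				      unsigned int duty_percent)
{
	struct blink_timing t;
	uint64_t period_us;

	assert(freq_hz > 0);
	assert(duty_percent <= 100);

	period_us = 1000000ULL / freq_hz;
	t.on_us = period_us * duty_percent / 100;
	t.off_us = period_us - t.on_us;
	return t;
}
```

For 1 kHz this gives 500 us on and 500 us off at 50%, and 200 us on and 800 us off at 20%. With a frequency-only setting, the duty cycle is fixed and only one such pattern is available per frequency.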
Also, this has the extra feature that you can check if something gets
stuck in the process. The lifecycle is:
- Turn on the ACTIVITY LED at the start of the thing
- Blink once in a while (for very short tasks this might not happen)
- Turn off the ACTIVITY LED at the end of the thing
So if something goes wrong the LED would never turn OFF but would stay
solid ON.
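The lifecycle above can be sketched as a small counter-based state machine. This is only an illustration under assumptions: the names `activity_start()`, `activity_tick()` and `activity_stop()` are hypothetical, the LED is modelled as a boolean instead of a GPIO, and the toggle threshold is a parameter (200 is the figure mentioned elsewhere in this thread).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical activity LED state; the GPIO is modelled as a boolean. */
struct activity_led {
	bool on;
	unsigned int count;        /* calls since the last toggle */
	unsigned int toggle_every; /* toggle once per this many calls */
};

/* Start of the operation: LED on, counter reset. */
void activity_start(struct activity_led *led, unsigned int toggle_every)
{
	assert(toggle_every > 0);
	led->on = true;
	led->count = 0;
	led->toggle_every = toggle_every;
}

/* Called from the work loop; toggles only every toggle_every calls. */
void activity_tick(struct activity_led *led)
{
	if (++led->count >= led->toggle_every) {
		led->count = 0;
		led->on = !led->on;
	}
}

/* End of the operation: LED off. */
void activity_stop(struct activity_led *led)
{
	led->on = false;
	led->count = 0;
}
```

In this sketch a task shorter than `toggle_every` calls never blinks at all, and if `activity_tick()` stops being called the LED simply freezes at whatever level it last had until `activity_stop()` runs.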
Yes, that's something that wouldn't be covered by cyclic framework here. It
all depends what you want to do, if it's an activity LED, then we need to
hook ourselves deep into frameworks where stuff is actually happening. If
it's just to specify which stage of the boot we reached, then cyclic would
be good enough probably (register for stage 1, unregister stage1+register
for stage2 for different frequency, etc...).
The cyclic framework can reduce the implementation to just START and
STOP. We would lose the ability to know if there is actual progress
or not though... So maybe that is bad, but honestly a TFTP transfer can be
tracked by the other machine and MTD write/erase won't magically stop
and get stalled... (and even then they will time out and the status
LED stop will be called anyway)
So a dumb blinking with the watchdog is O.K. This is really a simple
thing to show that something is happening (use case of recovering the
device without actually using serial)
Then it's not so much an activity LED anymore, rather a "I'm still alive
and doing X thing right now, but maybe I'm stuck who knows", e.g. a
little bit like a glorified heartbeat (I'm not saying it's bad, it's
just a different use case :) ).
More than happy to rework this to a less intrusive implementation.
Maybe this can be generalized to some generic API like task_start(),
task_processing() and task_end()? Might make more sense than having to
add specific LED function to each function?
This also likely would introduce a hit in performance if we need to toggle
the GPIO in the same thread that we do TFTP/storage medium reading/writing?
I assume we could still adapt cyclic to make it spawn a one time event
instead of looping (e.g. by unregistering itself at the end of its own
callback?).
But is the penalty that bad? Unless you have the crazy idea of an
absurdly low value for the freq, it would be triggered once every 200
iterations. If you are transferring MBs of data you are probably on x86 or
a high-end device where a GPIO bit set doesn't really affect anything.
The kernel on Aarch64 is usually a few MB, sometimes a few tens of MB,
then you may have an initramfs which is also in the tens of MB.
People are trying to get U-Boot to boot into Linux ASAP, so maybe
they'll be bothered by this. If they are, they can always disable the
support for this LED status feature or improve it, so I wouldn't be too
worried about it for now since it won't be on by default anyway?
My current idea is that start() will register a cyclic and the LED will
blink with the watchdog. stop() will deregister it.
Sounds good :)
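A rough sketch of that start/stop idea, for illustration only. It does not use the real cyclic API: `mock_cyclic_register()`, `mock_cyclic_unregister()` and `mock_cyclic_run()` are stand-ins invented here, with `mock_cyclic_run()` playing the role of the point where U-Boot would poll its periodic functions. `status_led_activity_start()` and `status_led_activity_stop()` are hypothetical names as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a periodic-callback framework; NOT the U-Boot cyclic API. */
typedef void (*mock_cyclic_fn)(void *ctx);

struct mock_cyclic {
	mock_cyclic_fn fn;
	void *ctx;
	uint64_t delay_us; /* interval between two calls */
	uint64_t next_us;  /* time at which the next call is due */
	bool active;
};

void mock_cyclic_register(struct mock_cyclic *c, mock_cyclic_fn fn,
			  void *ctx, uint64_t delay_us, uint64_t now_us)
{
	c->fn = fn;
	c->ctx = ctx;
	c->delay_us = delay_us;
	c->next_us = now_us + delay_us;
	c->active = true;
}

void mock_cyclic_unregister(struct mock_cyclic *c)
{
	c->active = false;
}

/* Polled from the main loop: run the callback when it is due. */
void mock_cyclic_run(struct mock_cyclic *c, uint64_t now_us)
{
	if (!c->active || now_us < c->next_us)
		return;
	c->fn(c->ctx);
	c->next_us = now_us + c->delay_us;
}

/* Hypothetical status LED with its own periodic entry. */
struct status_led {
	bool on;
	struct mock_cyclic cyclic;
};

static void status_led_toggle(void *ctx)
{
	struct status_led *led = ctx;

	led->on = !led->on;
}

/* start(): LED on, then toggle every half blink period. */
void status_led_activity_start(struct status_led *led, unsigned int freq_hz,
			       uint64_t now_us)
{
	assert(freq_hz > 0);
	led->on = true;
	mock_cyclic_register(&led->cyclic, status_led_toggle, led,
			     1000000ULL / (2ULL * freq_hz), now_us);
}

/* stop(): drop the periodic callback and leave the LED off. */
void status_led_activity_stop(struct status_led *led)
{
	mock_cyclic_unregister(&led->cyclic);
	led->on = false;
}
```

The callers only ever see start and stop; whether the LED actually blinks depends on how often the polling point is reached while the operation runs.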
(AFAIK the Linux kernel has something similar, used for the whole trace
framework, so having something in U-Boot to trace these kinds of operations
might be interesting)
Indeed, that's what's being done with ledtrig_.* functions, they are however
scheduled on a workqueue and called from the subsystem directly.
I'm a bit confused also as to why we control the LED blinking from the cmd/
? E.g. for cmd/mtd.c I would assume that the changes made to the mtd
subsystem should be enough to handle those? Similarly, since UBI is for use
over NAND/MTD, shouldn't that already be handled by the MTD subsystem, and
if not, why not in the UBI subsystem instead of the CMD_UBI? One of the
issues is that you may not necessarily go through the cmd/ to do stuff with
storage medium or network (e.g. directly from board files).
It's a mix of cmd and subsystem because you need to turn the LED on. Use
the activity function to increase the counter and toggle the LED at
actual activity (for example an MTD write command, a single block received
from TFTP)
If the cyclic thing works as I think, MAYBE we can just add this to the
cmd part without having to disturb all the subsystems. And that is a much
cleaner approach.
Why isn't the subsystem turning the LED on when writing to the counter,
why does it have to go through the CMD?
If I'm not mistaken, I have my disk-activity LED in Linux working just
fine without having to rely on a userspace tool to activate it for me?
Maybe I missed something there though :)
For the net protocols, why not hook this to the net_[sg]et_*_handler for
example so it's protocol-agnostic? No clue how difficult this would be, or
if you'd rather have something like per-protocol activity?
Attaching to the net handler would provide a way too generic activity
thing and would result in a dumb NIC LED. We want to show activity when
we actually want to signal stuff, not when the device is executing ASM
instructions ahahah
How does one define "when we actually want to signal stuff"? The further
away you're from the subsystem, the more consumers we're going to have
to patch to add this logic.
So yes, I feel adding this to a way too generic subsystem might
deviate from the idea of this. Even attaching to the write and erase
functions was a bit problematic and too generic.
Finally, maybe we also want to have a Kconfig symbol per "type" of activity
to control what should be "monitored", and I would also suggest if we go
this route to have a Kconfig symbol for the frequency per "type" of activity
as well, so that one can know which activity is happening right now.
Yes that might be good but I have some fear we might explode with
Kconfig. I think it's better to first have a solid idea on how this
works and then we can think of the configuration part.
Oh, we've seen much worse with the VPL_/TPL_/SPL_// symbols :) Can come
later though if someone has a usecase for it.
Cheers,
Quentin