Hi,

I don't think updating the comment can do any harm so why not. And since I have a fresh experience of trying to add support for new AVR family/board, I can certainly imagine that I would be confused if I noticed the discrepancy between the comment about the variable and its initialization value.

diff --git a/arch/avr/src/common/avr_internal.h b/arch/avr/src/common/avr_internal.h
index 2430b99b89a..c518913c287 100644
--- a/arch/avr/src/common/avr_internal.h
+++ b/arch/avr/src/common/avr_internal.h
@@ -95,7 +95,7 @@ extern uint8_t g_intstacktop[];

 extern uint8_t _stext[];           /* Start of .text */
 extern uint8_t _etext[];           /* End_1 of .text + .rodata */
-extern const uint8_t _eronly[]; /* End+1 of read only section (.text + .rodata) */ +extern const uint8_t _eronly[]; /* Start of .data section in FLASH */
 extern uint8_t _sdata[];           /* Start of .data */
 extern uint8_t _edata[];           /* End+1 of .data */
 extern uint8_t _sbss[];            /* Start of .bss */

This is what every other comment about the variable says so it's probably fine like that.

As for the .trampolines section, I'm kind of wondering if the linker script can safely be kept as it is now (and have linker figure things out) or if it's safer to have the section in there explicitly.

On 2025-05-29 17:08, Matteo Golin wrote:
Hello KR,

No need to apologize! Thank you for all your help on this issue. I actually didn't know about these "trampoline" sections, I had to look them up after your email. If you think it might make the code clearer, I can update the comments about `_eronly` to reflect that it is no longer the end of .text and .rodata, and instead just the start of .data? I can also mention these
trampoline sections in between.

What do you think?
Matteo

On Wed, May 28, 2025 at 8:11 PM <kr....@kerogit.eu> wrote:

Well, I think I managed to figure it out.

First, an apology - I didn't read the patch in #16457 properly and
thought it just moves the _eronly assignment while it, in fact, changes
ABSOLUTE to LOADADDR too. Sorry for that.

After spotting this, I went and tried to find out how the change makes
things different because from my understanding, it should do the same
thing - assignment of current address where .data section begins should
be the same as assignment of address of the .data section.

Turns out there is a difference - the linker also generated a
.trampoline section and placed it between _eronly assignment and
beginning of .data section. My guess is that when this section is not
mentioned in the linker script, the linker has free reign as to where to
put it...? Anyway, that made _eronly value incorrect by .trampoline
section size and things went badly from there.

The issue can therefore also be fixed by this patch:

diff --git a/boards/avr/atmega/arduino-mega2560/scripts/flash.ld
b/boards/avr/atmega/arduino-mega2560/scripts/flash.ld
index f365646dfa..f9ceeb3a08 100644
--- a/boards/avr/atmega/arduino-mega2560/scripts/flash.ld
+++ b/boards/avr/atmega/arduino-mega2560/scripts/flash.ld
@@ -122,6 +122,12 @@ SECTIONS
          _etext = . ;
      } > flash

+    .trampolines     :
+    {
+        *(.trampolines)
+        *(.trampolines.*)
+    } > flash
+
      _eronly = ABSOLUTE(.);

      .data            :

The reason why I wasn't able to reproduce this on mega1284p-xplained is
then simple - there is no .trampolines section for that chip in the
compiled binary.

In my opinion, the proposed patch in #16457 is a better solution. While it does not preserve what the code comments say about the _eronly value ("this is where .text + .rodata ends"), it is a better match to what the
_eronly value is actually used for ("this is start of .data section")

On 2025-05-28 11:59, kr....@kerogit.eu wrote:
> Alright, I looked into the possible bug mentioned in the last paragraph
> of my previous message and it seems the code is actually correct. What
> I thought was that if register pair Z overflows, subsequent reads would
> read from incorrect memory locations. However, that does not seem to be
> the case - according to AVR Instruction Set Manual DS40002198A, ELPM
> instruction post-increment "applies to the entire 24-bit concatenation
> of the RAMPZ and Z-Pointer Registers"
>
> As for the change of _eronly however, I still think that is incorrect.
> Will try to look into that more during the evening unless someone
> figures something else out.
>
> What compiler are you using anyway? From the kernel version mentioned
> in GitHub I assume gcc 14.2? Might be the reason why everything works
> fine for me and not for anyone else, I am still on 5.4 from Debian
> stable. (Can try that too, Debian testing also has 14.2.)
>
> On 2025-05-28 09:16, kr....@kerogit.eu wrote:
>> Hello,
>>
>> I don't that is a correct fix and in fact, from a quick look at things
>> it seems to me that it actually introduces another bug. As described
>> in src/atmega/atmega_head.S, that value is supposed to be "Start of
>> .data section in FLASH". You changed it to end of .data section in
>> FLASH. It is used after __do_copy_data label as an address where
>> initialized data start in flash memory. So if I read this correctly,
>> you are copying incorrect data to your initialized variables.
>>
>> If the program starts working this way, that might suggest that the
>> initialization data are placed incorrectly in the image? Anyway, the
>> linker script is pretty much identical to linker script of
>> mega1284p-xplained and that one works out of the box without need to
>> move _eronly.
>>
>> Hm, looking at this, I think I just spotted a bug in __do_copy_data
>> but that would only hit if your initialized data region crossed a 64kB
>> boundary. Is that the case in any chance? It would cause _some_
>> initialized variables to contain random garbage...
>>
>> On 2025-05-28 07:17, Matteo Golin wrote:
>>> Managed to find a fix here:
>>> https://github.com/apache/nuttx/pull/16457
>>>
>>> On Tue, May 27, 2025 at 11:24 PM Matteo Golin
>>> <matteo.go...@gmail.com>
>>> wrote:
>>>
>>>> Thank you for your help KR!
>>>>
>>>> I also noticed this CONFIG_RAM_START=0x800100. Unfortunately,
>>>> changing
>>>> that value to the proper 0x800200 did not really help the issue.
>>>>
>>>> I went ahead and generated the disassembly like you suggested, and
>>>> it does
>>>> appear that the .text section starts with jump instructions for all
>>>> the
>>>> interrupt vectors:
>>>>
>>>> 00000000 <_stext>:
>>>>        0:       0c 94 72 00     jmp     0xe4    ; 0xe4 <__start>
>>>>        4:       0c 94 95 00     jmp     0x12a   ; 0x12a
>>>> <atmega_int0>
>>>>        8:       0c 94 98 00     jmp     0x130   ; 0x130
>>>> <atmega_int1>
>>>>        c:       0c 94 9b 00     jmp     0x136   ; 0x136
>>>> <atmega_int2>
>>>>       10:       0c 94 9e 00     jmp     0x13c   ; 0x13c
>>>> <atmega_int3>
>>>>       14:       0c 94 a1 00     jmp     0x142   ; 0x142
>>>> <atmega_int4>
>>>>       18:       0c 94 a4 00     jmp     0x148   ; 0x148
>>>> <atmega_int5>
>>>>       1c:       0c 94 a7 00     jmp     0x14e   ; 0x14e
>>>> <atmega_int6>
>>>>       20:       0c 94 aa 00     jmp     0x154   ; 0x154
>>>> <atmega_int7>
>>>>       24:       0c 94 ad 00     jmp     0x15a   ; 0x15a
>>>> <atmega_pcint0>
>>>>       28:       0c 94 b0 00     jmp     0x160   ; 0x160
>>>> <atmega_pcint1>
>>>>       2c:       0c 94 b3 00     jmp     0x166   ; 0x166
>>>> <atmega_pcint2>
>>>>       30:       0c 94 b6 00     jmp     0x16c   ; 0x16c <atmega_wdt>
>>>>       34:       0c 94 b9 00     jmp     0x172   ; 0x172
>>>> <atmega_tim2_compa>
>>>>       38:       0c 94 bc 00     jmp     0x178   ; 0x178
>>>> <atmega_tim2_compb>
>>>>       3c:       0c 94 bf 00     jmp     0x17e   ; 0x17e
>>>> <atmega_tim2_ovf>
>>>>       40:       0c 94 c2 00     jmp     0x184   ; 0x184
>>>> <atmega_tim1_capt>
>>>>       44:       0c 94 c5 00     jmp     0x18a   ; 0x18a
>>>> <atmega_tim1_compa>
>>>>       48:       0c 94 c8 00     jmp     0x190   ; 0x190
>>>> <atmega_tim1_compb>
>>>> ...and so on...
>>>>
>>>> `g_idle_topstack` is 0x800e2b in the disassembly. With RAM_START
>>>> being
>>>> 0x800200, and a size of 8192 bytes, that gives a `heap_size` of 5076
>>>> bytes.
>>>> That all seems fine.
>>>>
>>>> However, printing out the value of `g_idle_topstack` (in binary
>>>> using
>>>> `up_putc`) I can see that the value does not match. The printed
>>>> value is
>>>> "1000011001110110" while the binary representation of 0x800e2b is
>>>> "100000000000111000101011".
>>>> Printing out the `heap_size` tells me it's "1111100111001101", which
>>>> is
>>>> not 5076.
>>>>
>>>> It seems your suspicion about a mismatch is probably correct,
>>>> although
>>>> it's not due to a misplaced vector table it seems.
>>>>
>>>> I did, for fun, try enabling `CONFIG_DEBUG_OPT_UNUSED_SECTIONS` and
>>>> it did
>>>> remove the vectors at the start of the assembly file. That image
>>>> didn't
>>>> work.
>>>> Then I kept `CONFIG_DEBUG_OPT_UNUSED_SECTIONS` enabled and put the
>>>> `KEEP(*(.vectors))` suggestion in the linker script. Vectors are
>>>> kept in
>>>> the image, and `g_idle_topstack` changes to 0x800be4. This image
>>>> didn't
>>>> work either.
>>>>
>>>> That got me thinking if there was another section getting
>>>> unexpectedly
>>>> tossed, so I put `KEEP(...)` around basically everything in the
>>>> linker
>>>> script. That didn't produce a different result, and still had the
>>>> `g_idle_topstack` of 0x800e2b.
>>>>
>>>> I will continue to debug this, but if you have any other suggestions
>>>> please let me know. Thank you very much for all your help so far!
>>>>
>>>> Matteo
>>>>
>>>> On Tue, May 27, 2025 at 7:33 PM <kr....@kerogit.eu> wrote:
>>>>
>>>>> Thanks to you as well.
>>>>>
>>>>> I spotted an error in the config posted both in the issue #16444
>>>>> and PR
>>>>> #16443. It is the
>>>>>
>>>>> CONFIG_RAM_START=0x800100
>>>>>
>>>>> Unless I overlooked something, the RAM starts at address 0x200
>>>>> (0x800200) for AtMega 2560.
>>>>>
>>>>> Not sure if that's what causing the problem though - as far as I
>>>>> can
>>>>> see, that value is actually only used as an input to CONFIG_RAM_END
>>>>> calculation and incorrect value in this macro simply reduces heap
>>>>> size.
>>>>>
>>>>> The symptoms of the bug would match though - the program would be
>>>>> trying
>>>>> to use I/O registers as RAM.
>>>>>
>>>>> Thing is - seeing the bug was narrowed down to heap initialization
>>>>> in
>>>>> nx_start - I remember that I ran into pretty much the same issue
>>>>> when
>>>>> adding support for AVR DA/DB family... and I can't remember what
>>>>> was the
>>>>> problem.
>>>>>
>>>>> Maybe check the disassembly file? (RAW_DISASSEMBLY needs to be set
>>>>> for
>>>>> the file to be generated.) Verify that the .text section starts
>>>>> with jmp
>>>>> instructions for interrupt vectors. This is only a vague memory but
>>>>> I
>>>>> think that I ran into the issue because the linker removed .vectors
>>>>> section, thinking it was unreferenced. (But that should not happen
>>>>> here,
>>>>> the linked config shows CONFIG_DEBUG_OPT_UNUSED_SECTIONS is not
>>>>> set.)
>>>>>
>>>>> Anyway, when that happened, the whole program got "shifted" -
>>>>> except for
>>>>> program memory that stored values of initialized variables. The
>>>>> data
>>>>> copy loop in initialization code (atmega_head.S) then read the
>>>>> values
>>>>> from the shifted location leading to a mismatch - variables were
>>>>> populated with wrong data. One of those variables was
>>>>> g_idle_topstack
>>>>> which directly sets start of heap. I think it initialized it to
>>>>> zero...?
>>>>>
>>>>> Again, all of this is based on vague memory. TLDR version is: if
>>>>> able,
>>>>> check g_idle_topstack in up_allocate_heap and verify that it is
>>>>> correct.
>>>>>
>>>>>
>>>>> On 2025-05-27 15:28, Alan C. Assis wrote:
>>>>> > Thank you KR,
>>>>> >
>>>>> > I posted your message in that issue for reference.
>>>>> >
>>>>> > I have a BK-AVR128 board:
>>>>> > https://aliexpress.com/item/1005006234334573.html
>>>>> > and will test NuttX on it to confirm.
>>>>> >
>>>>> > If everything is working as expected I will submit the board
support to
>>>>> > the
>>>>> > mainline.
>>>>> >
>>>>> > BR,
>>>>> >
>>>>> > Alan
>>>>> >
>>>>> > On Tue, May 27, 2025 at 6:39 AM <kr....@kerogit.eu> wrote:
>>>>> >
>>>>> >> Hello,
>>>>> >>
>>>>> >> as mentioned in one of my previous e-mails to this list (RFC:
>>>>> >> decoupling
>>>>> >> ability to always panic from board_reset), I noticed #16444 and
#16443
>>>>> >> on GitHub.
>>>>> >>
>>>>> >> I found some time to look into this and I believe I found the
reason
>>>>> >> for
>>>>> >> "If you add avr_lowputc calls in the board initialization code,
you'll
>>>>> >> see that the TX LED stays stuck on indefinitely."
>>>>> >>
>>>>> >> Provided the board initialization code that sentence is talking
about
>>>>> >> is
>>>>> >> atmega_boardinitialize() in avr_boot.c, then avr_lowputc() likely
does
>>>>> >> not work because of the configuration used. According to #16444,
the
>>>>> >> configuration has:
>>>>> >>
>>>>> >> CONFIG_DEV_CONSOLE=y
>>>>> >> CONFIG_CONSOLE_SYSLOG=y
>>>>> >>
>>>>> >> These two are processed in src/atmega/atmega_config.h
>>>>> >>
>>>>> >> #ifndef CONFIG_DEV_CONSOLE
>>>>> >> #  undef  USE_SERIALDRIVER
>>>>> >> #  undef  USE_EARLYSERIALINIT
>>>>> >> #else
>>>>> >> #  if defined(CONFIG_CONSOLE_SYSLOG)
>>>>> >> #    undef  USE_SERIALDRIVER
>>>>> >> #    undef  USE_EARLYSERIALINIT
>>>>> >> #  elif defined(HAVE_USART_DEVICE)
>>>>> >> #    define USE_SERIALDRIVER 1
>>>>> >> #    define USE_EARLYSERIALINIT 1
>>>>> >> #  else
>>>>> >> #    undef  USE_SERIALDRIVER
>>>>> >> #    undef  USE_EARLYSERIALINIT
>>>>> >> #  endif
>>>>> >> #endif
>>>>> >>
>>>>> >> With the configuration above, the outer ifndef is not true and
first
>>>>> >> if
>>>>> >> defined in else block is true, which results into:
>>>>> >>
>>>>> >> undef  USE_SERIALDRIVER
>>>>> >> undef  USE_EARLYSERIALINIT
>>>>> >>
>>>>> >> Since CONFIG_STANDARD_SERIAL is also set, the undef of
>>>>> >> USE_SERIALDRIVER
>>>>> >> is reverted by define USE_SERIALDRIVER 1 later. However,
>>>>> >> USE_EARLYSERIALINIT remains unset.
>>>>> >>
>>>>> >> This causes avr_earlyserialinit() to not be built nor called from
>>>>> >> avr_lowinit(). Serial port peripheral is therefore not
initialized yet
>>>>> >> when atmega_boardinitialize() is called. I don't know what exactly
>>>>> >> happens when you attempt to transmit data with the port not
enabled
>>>>> >> but
>>>>> >> my guess would be that "transmit data register empty" status flag
is
>>>>> >> just never cleared and the program ends up in a loop waiting for
that
>>>>> >> to
>>>>> >> happen.
>>>>> >>
>>>>> >>
>>>>> >> Other than that - I recently tested NSH on mega1284p-xplained
(well, a
>>>>> >> breadboard with the chip stuck in it actually) and it worked for
me.
>>>>> >> As
>>>>> >> far as I can see, all AtMega devices use the same code for
managing
>>>>> >> serial ports so it should work out of the box.
>>>>> >>
>>>>> >> Someone somewhere in some forum on the net has or had a footer in
his
>>>>> >> posts saying something along the lines of that non-functional
serial
>>>>> >> port is 99% mismatching baud rates, might be worth a re-check.
>>>>> >>
>>>>> >> As for the PR itself (copying from e-mail mentioned at the
beginning)
>>>>> >> -
>>>>> >> I would recommend trying to use KEEP(*(.vectors)) as seen in
>>>>> >> boards/avr/avrdx/breadxavr/scripts/breadxavr.ld - the default
config
>>>>> >> should then not need the "# CONFIG_DEBUG_OPT_UNUSED_SECTIONS is
not
>>>>> >> set"
>>>>> >> line.
>>>>> >>
>>>>>
>>>>

Reply via email to