RFC - some AVR (and other) fixes

kr . git Thu, 10 Apr 2025 01:55:05 -0700

Hello,

I would like to solicit a review for some fixes of (mostly) AVR architecture code. These are some things I noticed when attempting to create a port for AVR DA/DB family. I am doing my development based on ATmega family since I am familiar with that and these patches (should) only apply to it. I tried to avoid any changes that would affect the other families but I cannot do any testing to confirm these patches don't break anything for them.

I tested these patches with ATmega1284 - since most of them affect context switch code, I made a simple stress application that spawns 4 tasks. First reads data from serial port, stores-and-forwards them through tasks 2 and 3 and finally to a task 4 which writes them back to serial port to be verified for integrity by the PC side. As long as no new data are arriving, currently held data is constantly shuffled through registers and stack. None of the tasks ever sleep, forcing as many context switches as possible. I ran this application for a few hours and no data corruption was observed.

Please note that I do not have a Github account so, if these patches are accepted, I will be relying on someone else picking them up and doing a PR.

I will attempt to post these patches through the mailing list, there is also a git repository nuttx.git available at git.kerogit.eu through HTTP/S . The branch with the changes is called avr_fixes_rfc1 and is based on upstream's releases/12.9 . It also rebases cleanly onto current master if that is the more preferred location where to put these.


Description of the patches follows:

=== 0001 fixed function name ===

Mistyped function name in Documentation resulting in something not present in the sources.


=== 0002 fixed typo in Kconfig ===

Missing letter.

=== 0003 save RAMPZ register into thread context ===

RAMPZ is an extension register used when accessing program memory (flash) from the program itself. It is concatenated with value in pointer register pair Z. Since user applications can reasonably be expected to load their data from program memory, this register should be saved into the thread context as well.

That said, it's quite unlikely to hit the bug (reading from incorrect address) fixed by this patch - there are three conditions that need to be met:

- forced context switch of a task that is currently reading from program memory - any other tasks that receives CPU time also needs to read from program memory - program memory address of those tasks must differ in RAMPZ value, which is the highest byte

Meeting the last condition is somewhat unlikely because all data stored in program memory tends to be placed in a single region by the compiler. So to trigger the bug, this region would need to be large enough to cross 64kB boundary. Considering most supported devices have 128kB of flash in total and NuttX itself needs close to 64kB with very few features enabled, it is quite unlikely to have this much const data and NuttX and still an useful program.

Still feels worth fixing though, impact on existing users should be negligible - few bytes of memory spent and few extra instructions. The fix also applies to ATmega2560 with 256kB of flash where it is more likely to have a program that can actually hit this.

Note that similar bug still exists in ATmega2560 with its EIND register. This register is used to extend address register pair used for making indirect jumps (making jumps to addresses beyond 128kB boundary possible.) This is not addressed by this patch as I don't have such chip to be able to test the change.


=== 0004 flag microcontrollers that have RAMPZ register ===

Enables code from previous patch on all currently supported chips.

=== 0005 replace AVR __flash qualifier with __memx for some chips ===

This one may need some feedback. This patch changes __flash qualifier to __memx in IOBJ macro. These qualifiers instruct the compiler to place (or, rather, keep) const data in program memory and read it from there. The difference is that __flash is only able to address 64kB memory size while __memx extends the pointer size to 24 bits. The (possible) issue here is that this definition of IOBJ is currently restricted to compilation with AVR toolchain which I don't use and therefore can't test. However, I believe that their toolchain is based on gcc (which I use) and therefore behaves the same. At the minimum, the __memx qualifier is supported there for sure as it is the value used in IPTR macro already present in the code.

Anyway, with this changes and with corresponding configuration options enabled, the code compiles and - judging from the assembly - seems to do the right thing (ie. does respect the __memx qualifier from IPTR macro used in parameters for syslog and syslog-related functions.)

Using __flash qualifier as it is used now may cause bugs when const data is placed to memory regions above 64kB. As of now, compiler tends to put them at the beginning of the .text section so it is again unlikely to happen - as long as the data isn't large enough. And again, might be worth fixing since impact on existing users should be negligible. For starters, this is currently only used for debugging code and the associated cost of using 24 bit pointers is already paid in current form because of IPTR being set to __memx

This change does not apply for devices with 64kB or smaller program memory (although there is no such device currently supported.)


=== 0006 fixed PC load on chips with more than 128kB program memory ===

The original code uses Y pointer register pair to load 3 bytes from given memory address. However, using subi (SUBtract Immediate) instruction here is incorrect as it only affects the register it is called upon - YL, low byte from the Y register pair. High byte remains unchanged, resulting in incorrect load if the lower byte value wraps.

The correct instruction for this would be sbiw (SuBtract Immediate from Word) which operates on the whole pair. The fix does not do that though - instead, the Y register pair is preloaded with value incremented by 1 and loads are executed using pre-decrement in the load instructions. This both fixes the bug and also saves a bit of CPU time. The value in Y register then ends up being different compared to the original code but that should not matter since - as far as I can see - it is not used and is later overwritten.

All the feedback and help is appreciated and my apologies for this text being this long

0001-Documentation-reference-user-02_task_scheduling-fixe.patch.gz
Description: application/gzip

0002-arch-avr-fixed-typo-in-Kconfig.patch.gz
Description: application/gzip

0003-arch-avr-save-RAMPZ-register-into-thread-context.patch.gz
Description: application/gzip

0004-arch-avr-atmega-flag-microcontrollers-that-have-RAMP.patch.gz
Description: application/gzip

0005-nuttx-compiler-replace-AVR-__flash-qualifier-with-__.patch.gz
Description: application/gzip

0006-arch-avr-fixed-PC-load-on-chips-with-more-than-128kB.patch.gz
Description: application/gzip

RFC - some AVR (and other) fixes

Reply via email to