Hello,finally got to it, 4th revision of the patches is in branch named uart_fixes_rfc4 and also attached. This revision incorporates suggestions from GitHub review.
1. The order of patches in the series is reversed (not counting the typo fix patch.) The less intrusive one is now placed before the other. This way, the smaller fix is not affected by changes that follow, namely by using the new data type.
2. Instead of ifdefs in various places, the patch introduces type sbuf_size_t in include/nuttx/serial/serial.h . Buffer size, head and tail variables are changed to this type.
3. ARCH_LD_16BIT_NOT_ATOMIC configuration option is renamed to ARCH_LDST_16BIT_NOT_ATOMIC to denote that store operations are not atomic as well for the architecture.
4. Default serial buffer size is changed from 256 to 252. Additionally, help of configuration options for USARTn_(RX|TX)BUFSIZE now mentions that if the architecture is unable to load/store 16-bit values atomically, the size of buffer is stored in uint8_t.
5. New configuration option ARCH_HAVE_SERIAL_8BIT_BUFSIZE is added. This configuration option triggers change of sbuf_size_t from default int16_t to architecture-specific uint8_t. It is, however, not selected from AVR serial driver Kconfig option as the review suggests. Instead, it is selected by ARCH_LDST_16BIT_NOT_ATOMIC. The reasoning for doing it this way is that AVR is (likely) not the only architecture suffering from this race condition - any 8-bit architecture is, unless it does 16-bit loads/stores atomically. As far as I was able to find, this applies to Z80 as well.
Also, serial buffer is likely not the only area that introduces such a race. If more such cases are found, selecting all configuration options that enable measures that prevent these race conditions from all affected architectures would lead to M x N configuration options and all architectures would need to track new options being added. Selecting all of them from single point (ie. from ARCH_LDST_16BIT_NOT_ATOMIC) makes the configuration easier, the architecture only needs to declare what is is (in)capable of and all related options will be enabled automatically.
As for the comment from jlaitine - I also find it strange that the original index is a signed value and I found no reason for it being that way. And yes, it is left untouched to prevent any changes in other architectures.
If there is still something left to improve, let me know. Thanks. On 2025-06-11 13:42, alin.jerpe...@sony.com wrote:
Hi KR, I added your comments Best regards Alin ________________________________ Från: kr....@kerogit.eu <kr....@kerogit.eu> Skickat: den 11 juni 2025 07:54 Till: dev@nuttx.apache.org <dev@nuttx.apache.org> Ämne: Re: Sv: Sv: RFC: fix race conditions in drivers/serial/serial.cHello, thanks. Here is a response to xiaoxiang781216's review: > could we extract a common typedef to avoid #ifdef/#else/#endif spread > the code base? Can do - sbuf_size_t seems unused throughout the whole tree so I'll add that to serial. h.Hello, thanks. Here is a response to xiaoxiang781216's review:could we extract a common typedef to avoid #ifdef/#else/#endif spread the code base?Can do - sbuf_size_t seems unused throughout the whole tree so I'll add that to serial.h.can we change to ARCH_HAVE_8BIT_BUFSIZE and put into serial/Kconfig?I am not sure if that means to a) change the name from ARCH_LD_16BIT_NOT_ATOMIC to ARCH_HAVE_8BIT_BUFSIZE or b) addARCH_HAVE_8BIT_BUFSIZE and have it selected by ARCH_LD_16BIT_NOT_ATOMIC?Neither seems fully correct though. The first variant reduces the problem to buffers only but that is too narrow. Essentially, any other value that is simultaneously written in interrupt code and read in non-interrupt code needs the be treated in the same manner as buffer head/tail here. The current name reflects that. Also, adding ARCH_LD_32BIT_NOT_ATOMIC may be needed as well (possibly not only for AVR but for 16 bit microcontrollers too - if I understand it correctly, that applies to Freescale M9S12.) The second variant implies that all of the arch has 8 bit buffer size for all buffers everywhere which is not true and may not be desirable. How about adding ARCH_HAVE_8BIT_SERIAL_BUFSIZE to serial/Kconfig and have it selected by ARCH_LD_16BIT_NOT_ATOMIC?could we directly change the default value to 255Can be done but that would affect other architectures as well which is something I wanted to avoid. Also, using 255 feels dangerous when the underlying type is uint8_t. Might lead to bugs where expression like (x + 1 > size) never evaluates to true. Not that I see any such code in current sources, but if the default value should change for all architectures, I would rather use 252 out of abundance of caution.move to help of ARCH_LD_16BIT_NOT_ATOMICI think it would be better to put something more generic in there. Additionally, I'll copy the comment about the buffer size to help for USARTn_RX/TXBUFSIZE.why not merge to previous patchBecause it fixes different bug (on all architectures.) I will reorderthe patches to make that more obvious and to have the less intrusive onebe the first in case it turns out that the second one needs to be reverted. On 2025-06-09 15:10, alin.jerpe...@sony.com wrote:Hi KR, the patches have been updated https://urldefense.com/v3/__https://github.com/apache/nuttx/pull/16466__;!!O7_YSHcmd9jp3hj_4dEAcyQ!1y3RB5EUXoBosi8vAEcjSyRXMCiRnRtrGSfJEO84MiLP3anDf6k6445X2RlUvrB55u8lzti4-TaTVNIzppE$ Best regads Alin ________________________________ Från: kr....@kerogit.eu <kr....@kerogit.eu> Skickat: den 9 juni 2025 12:12 Till: dev@nuttx.apache.org <dev@nuttx.apache.org> Ämne: Re: Sv: RFC: fix race conditions in drivers/serial/serial.c Hello, reworked version of the series is in branch uart_fixes_rfc3 and also attached. This version incorporates xiaoxiang781216's suggestion on GitHub to simply make the buffer size stored in uint8_t variable if the architecture cannot handle Hello, reworked version of the series is in branch uart_fixes_rfc3 and also attached. This version incorporates xiaoxiang781216's suggestion onGitHub to simply make the buffer size stored in uint8_t variable if thearchitecture cannot handle 16-bit loads atomically.Fix for TOCTOU in watermark calculation was split into a separate patchsince it addresses a different issue.Testing was done in the same way as the original version of the series.On 2025-06-05 07:44, kr....@kerogit.eu wrote:Hello, based on the feedback in GitHub review, I will redo the patch differently and reduce the size of the circular buffers so the variables fit into uint8_t. To respond to xiaoxiang781216's comment about using atomic_xxx API instead the critical section - I was considering that as well but in the end, I concluded that the critical section is highly unlikely to be compiled in on any architecture that supports SMP. It does set a bad example though, that did not occur to me previously. The atomic_xxx API I decided not to use because when I tried it, it produced buggy code. I will post a patch/question regarding that in a separate thread. Nevertheless, if reads are made atomic, neither of the above is needed. Thanks for the feedback, will try to do the rework in next few days. On 2025-06-02 18:58, kr....@kerogit.eu wrote:Hello, I noticed a review from Allan C. Assis about a spelling error. Updated patches are attached and also available in repository in uart_fixes_rfc2 branch. (Sorry about the extra work, I was even trying to check the word using web search... but I didn't spot that the search autocorrected it so my conclusion was that the spellchecker I also used is just tripping on the plural.) Best regards On 2025-06-02 09:39, alin.jerpe...@sony.com wrote:Hi KR, thank for submitting the patches They have been uploaded to mainline and are under review https://urldefense.com/v3/__https://github.com/apache/nuttx/pull/16466__;!!O7_YSHcmd9jp3hj_4dEAcyQ!yLV1nobwBFiEr5gCY_xNYvnCqhLJeSSyqnb6uJgKkEsQ6_1NJtMiI_zeeQRnjWuYSysKvDDj5FMrky7Yhq8$ Best regards Alin ________________________________ Från: kr....@kerogit.eu <kr....@kerogit.eu> Skickat: den 1 juni 2025 22:42 Till: dev@nuttx.apache.org <dev@nuttx.apache.org> Ämne: RFC: fix race conditions in drivers/serial/serial.c While going through the code in drivers/serial/serial. c, I noticedthis comment: The head and tail pointers are 16-bit values. The onlytime that the following could be unsafe is if the CPU made two non-atomic 8-bit accesses to obtain the 16-bit While going through the code in drivers/serial/serial.c, I noticed this comment: The head and tail pointers are 16-bit values. The only time that the following could be unsafe is if the CPU made two non-atomic 8-bit accesses to obtain the 16-bit head index.This is what happens for (at least) AVR architecture. These are 8bitmicrocontrollers and as such, they will fetch the 16-bit value in two8-bit load instructions; interrupt routine may execute between thoseandchange the value being read. This will result in corrupted read (onebyte of the value will be from pre-interrupt state and the other from post-interrupt state.) This patch introduces CONFIG_ARCH_LD_16BIT_NOT_ATOMIC configuration option, which is automatically selected for architectures known to be unable to load 16bit values in one instruction. (It is currently only set for AVR. I presume it might be also useful for Z80 but I do not have any experience with that architecture so I did no changes there.) When this configuration option is set, reads of recv.head and xmit.tail are enclosed with enter_critical_section and leave_critical_section calls to prevent interrupt handlers from running, if needed. Not all reads need to be protected this way - some are already in existing critical sections and some happen with the UART-specific interrupt disabled. If the configuration option is not set, the code simply loads the value into a local variable. Subsequent direct uses of the unprotected volatile variable are replaced with the local variables for both cases. There is also a related change that only applies when CONFIG_SERIAL_IFLOWCONTROL_WATERMARKS is set. Aside from the protection selected by CONFIG_ARCH_LD_16BIT_NOT_ATOMIC, this patch also fixes calculation of the nbuffered value. This calculation is not running with interrupts disabled and value of rxbuf->head can change between thecondition and actual computation. Even if the load itself is atomic,this leads to TOCTOU error. Impact: this change should not impact architectures that do not benefitfrom this change at all unless CONFIG_SERIAL_IFLOWCONTROL is set. IfCONFIG_SERIAL_IFLOWCONTROL is set, the only change that remains in effect is the fix of the TOCTOU error. Testing: Patch was tested with rv-virt:nsh - disassembly of functions from drivers/serial/serial.c was compared and did not change after thepatch was applied - with one exception. The exception was an addressof a global variable, I assume it was caused by change of g_version length (which notes that the tree is dirty) and is therefore inconsequential. CONFIG_SERIAL_IFLOWCONTROL was not set in this test. Configuration with CONFIG_SERIAL_IFLOWCONTROL set was tested by building it for AVR DA/DB. Configuration with CONFIG_SERIAL_IFLOWCONTROL unset was tested by custom echo application running on AVR128DA28 for a few hours. I also ran CI test: 1044 passed, 10 skipped, 14 deselected, 3 warnings in 2463.23s (0:41:03) On a related note, it seems to me that there may be a bug in echo handling in uart_readv. If dev->tc_lflag & ECHO is true, then uart_putxmitchar is called. This function adds the received character to transmit buffer and increments xmit.head value. However, uart_putxmitchar is also called from uart_writev; this function states "Only one user can access dev->xmit.head at a time" and takes the dev->xmit.lock. As far as I can see, uart_readv does not take the lock and may interfere with uart_writev. Evaluating and solving this properly is currently above my skillset though. (If this is even considered serious enough to warrant a fix.) The patch series also fixes a typo in drivers/serial/serial.c Both patches are attached to this message and also available in a git repository nuttx.git at git.kerogit.eu accessible through HTTP/S. (Trying to prevent bot traffic by not posting the URL in machine-readable form.) The relevant branch is called uart_fixes_rfc1. If the patches are acceptable, I would like to ask someone with GitHub account to open a PR (I don't have a GH account.) Any comments or suggestions are welcome.
0001-drivers-serial-serial-typo-fix-in-comment.patch.gz
Description: application/gzip
0002-drivers-serial-serial-fix-race-condition-in-flow-con.patch.gz
Description: application/gzip
0003-drivers-serial-serial-prevent-race-conditions-on-8-b.patch.gz
Description: application/gzip