Hello,

thanks. Here is a response to xiaoxiang781216's review:

could we extract a common typedef to avoid #ifdef/#else/#endif spread the code base?

Can do - sbuf_size_t seems unused throughout the whole tree so I'll add that to serial.h.

can we change to ARCH_HAVE_8BIT_BUFSIZE and put into serial/Kconfig?

I am not sure if that means to a) change the name from ARCH_LD_16BIT_NOT_ATOMIC to ARCH_HAVE_8BIT_BUFSIZE or b) add ARCH_HAVE_8BIT_BUFSIZE and have it selected by ARCH_LD_16BIT_NOT_ATOMIC? Neither seems fully correct though.

The first variant reduces the problem to buffers only but that is too narrow. Essentially, any other value that is simultaneously written in interrupt code and read in non-interrupt code needs the be treated in the same manner as buffer head/tail here. The current name reflects that. Also, adding ARCH_LD_32BIT_NOT_ATOMIC may be needed as well (possibly not only for AVR but for 16 bit microcontrollers too - if I understand it correctly, that applies to Freescale M9S12.)

The second variant implies that all of the arch has 8 bit buffer size for all buffers everywhere which is not true and may not be desirable.

How about adding ARCH_HAVE_8BIT_SERIAL_BUFSIZE to serial/Kconfig and have it selected by ARCH_LD_16BIT_NOT_ATOMIC?

could we directly change the default value to 255

Can be done but that would affect other architectures as well which is something I wanted to avoid. Also, using 255 feels dangerous when the underlying type is uint8_t. Might lead to bugs where expression like (x + 1 > size) never evaluates to true. Not that I see any such code in current sources, but if the default value should change for all architectures, I would rather use 252 out of abundance of caution.

move to help of ARCH_LD_16BIT_NOT_ATOMIC

I think it would be better to put something more generic in there. Additionally, I'll copy the comment about the buffer size to help for USARTn_RX/TXBUFSIZE.

why not merge to previous patch

Because it fixes different bug (on all architectures.) I will reorder the patches to make that more obvious and to have the less intrusive one be the first in case it turns out that the second one needs to be reverted.

On 2025-06-09 15:10, alin.jerpe...@sony.com wrote:
Hi KR,

the patches have been updated
https://github.com/apache/nuttx/pull/16466

Best regads
Alin

________________________________
Från: kr....@kerogit.eu <kr....@kerogit.eu>
Skickat: den 9 juni 2025 12:12
Till: dev@nuttx.apache.org <dev@nuttx.apache.org>
Ämne: Re: Sv: RFC: fix race conditions in drivers/serial/serial.c

Hello, reworked version of the series is in branch uart_fixes_rfc3 and also attached. This version incorporates xiaoxiang781216's suggestion on GitHub to simply make the buffer size stored in uint8_t variable if the architecture cannot handle


Hello,

reworked version of the series is in branch uart_fixes_rfc3 and also
attached. This version incorporates xiaoxiang781216's suggestion on
GitHub to simply make the buffer size stored in uint8_t variable if the
architecture cannot handle 16-bit loads atomically.

Fix for TOCTOU in watermark calculation was split into a separate patch
since it addresses a different issue.

Testing was done in the same way as the original version of the series.

On 2025-06-05 07:44, kr....@kerogit.eu wrote:
Hello,

based on the feedback in GitHub review, I will redo the patch
differently and reduce the size of the circular buffers so the
variables fit into uint8_t.

To respond to xiaoxiang781216's comment about using atomic_xxx API
instead the critical section - I was considering that as well but in
the end, I concluded that the critical section is highly unlikely to be
compiled in on any architecture that supports SMP. It does set a bad
example though, that did not occur to me previously.

The atomic_xxx API I decided not to use because when I tried it, it
produced buggy code. I will post a patch/question regarding that in a
separate thread.

Nevertheless, if reads are made atomic, neither of the above is needed.

Thanks for the feedback, will try to do the rework in next few days.

On 2025-06-02 18:58, kr....@kerogit.eu wrote:
Hello,

I noticed a review from Allan C. Assis about a spelling error. Updated
patches are attached and also available in repository in
uart_fixes_rfc2 branch. (Sorry about the extra work, I was even trying
to check the word using web search... but I didn't spot that the
search autocorrected it so my conclusion was that the spellchecker I
also used is just tripping on the plural.)

Best regards

On 2025-06-02 09:39, alin.jerpe...@sony.com wrote:
Hi KR,

thank for submitting the patches

They have been uploaded to mainline and are under review
https://urldefense.com/v3/__https://github.com/apache/nuttx/pull/16466__;!!O7_YSHcmd9jp3hj_4dEAcyQ!yLV1nobwBFiEr5gCY_xNYvnCqhLJeSSyqnb6uJgKkEsQ6_1NJtMiI_zeeQRnjWuYSysKvDDj5FMrky7Yhq8$


Best regards
Alin
________________________________
Från: kr....@kerogit.eu <kr....@kerogit.eu>
Skickat: den 1 juni 2025 22:42
Till: dev@nuttx.apache.org <dev@nuttx.apache.org>
Ämne: RFC: fix race conditions in drivers/serial/serial.c

While going through the code in drivers/serial/serial. c, I noticed
this comment: The head and tail pointers are 16-bit values. The only
time that the following could be unsafe is if the CPU made two
non-atomic 8-bit accesses to obtain the 16-bit


While going through the code in drivers/serial/serial.c, I noticed
this
comment:

The head and tail pointers are 16-bit values.  The only time that
the following could be unsafe is if the CPU made two non-atomic
8-bit accesses to obtain the 16-bit head index.

This is what happens for (at least) AVR architecture. These are 8bit
microcontrollers and as such, they will fetch the 16-bit value in two
8-bit load instructions; interrupt routine may execute between those
and
change the value being read. This will result in corrupted read (one
byte of the value will be from pre-interrupt state and the other from
post-interrupt state.)

This patch introduces CONFIG_ARCH_LD_16BIT_NOT_ATOMIC configuration
option, which is automatically selected for architectures known to be unable to load 16bit values in one instruction. (It is currently only
set for AVR. I presume it might be also useful for Z80 but I do not
have
any experience with that architecture so I did no changes there.)
When
this configuration option is set, reads of recv.head and xmit.tail
are
enclosed with enter_critical_section and leave_critical_section calls
to
prevent interrupt handlers from running, if needed. Not all reads
need
to be protected this way - some are already in existing critical
sections and some happen with the UART-specific interrupt disabled.

If the configuration option is not set, the code simply loads the
value
into a local variable. Subsequent direct uses of the unprotected
volatile variable are replaced with the local variables for both
cases.

There is also a related change that only applies when
CONFIG_SERIAL_IFLOWCONTROL_WATERMARKS is set. Aside from the
protection
selected by CONFIG_ARCH_LD_16BIT_NOT_ATOMIC, this patch also fixes
calculation of the nbuffered value. This calculation is not running
with
interrupts disabled and value of rxbuf->head can change between the
condition and actual computation. Even if the load itself is atomic,
this leads to TOCTOU error.

Impact: this change should not impact architectures that do not
benefit
from this change at all unless CONFIG_SERIAL_IFLOWCONTROL is set. If
CONFIG_SERIAL_IFLOWCONTROL is set, the only change that remains in
effect is the fix of the TOCTOU error.

Testing: Patch was tested with rv-virt:nsh - disassembly of functions
from drivers/serial/serial.c was compared and did not change after
the
patch was applied - with one exception. The exception was an address
of
a global variable, I assume it was caused by change of g_version
length
(which notes that the tree is dirty) and is therefore
inconsequential.
CONFIG_SERIAL_IFLOWCONTROL was not set in this test.

Configuration with CONFIG_SERIAL_IFLOWCONTROL set was tested by
building
it for AVR DA/DB. Configuration with CONFIG_SERIAL_IFLOWCONTROL unset was tested by custom echo application running on AVR128DA28 for a few
hours.

I also ran CI test: 1044 passed, 10 skipped, 14 deselected, 3
warnings
in 2463.23s (0:41:03)


On a related note, it seems to me that there may be a bug in echo
handling in uart_readv. If dev->tc_lflag & ECHO is true, then
uart_putxmitchar is called. This function adds the received character
to
transmit buffer and increments xmit.head value. However,
uart_putxmitchar is also called from uart_writev; this function
states
"Only one user can access dev->xmit.head at a time" and takes the
dev->xmit.lock. As far as I can see, uart_readv does not take the
lock
and may interfere with uart_writev. Evaluating and solving this
properly
is currently above my skillset though. (If this is even considered
serious enough to warrant a fix.)


The patch series also fixes a typo in drivers/serial/serial.c


Both patches are attached to this message and also available in a git
repository nuttx.git at git.kerogit.eu accessible through HTTP/S.
(Trying to prevent bot traffic by not posting the URL in
machine-readable form.) The relevant branch is called
uart_fixes_rfc1.
If the patches are acceptable, I would like to ask someone with
GitHub
account to open a PR (I don't have a GH account.) Any comments or
suggestions are welcome.

Reply via email to