On 4/20/25 6:58 PM, Greg Wooledge wrote:

That one may be fixed, but:

bash-5.3$ printf 'FOO\0\315\0\226\0' | while IFS= read -rd '' f; do printf '<%q>\n' "$f"; done
<FOO>
<$'\315'>
<''>
<''>

The context for all of this was someone in IRC who was reading a chunk
of data from /dev/urandom and got different results with LC_CTYPE=C vs.
LC_CTYPE=en_US.utf8 (or other UTF-8 locale).  This is a simplified
reproducer.

In real-life scripts, this kind of thing could arise if someone reads
a NUL-delimited stream of pathnames from find -print0, or equivalent.
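
Since the results reportedly differ between LC_CTYPE=C and a UTF-8 locale,
one possible way to avoid the multibyte handling for such streams is to do
the read in the C locale, where every byte counts as one character. The
following is only a sketch under that assumption; record_lengths is a
hypothetical helper name, not something from this thread, and it prints
byte lengths rather than the records themselves to keep the output
locale-independent.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the byte length of each NUL-delimited
# record read from stdin.
# The function body is a subshell ( ... ), so the LC_ALL assignment
# below does not leak into the caller's environment.
record_lengths() (
    LC_ALL=C    # byte semantics: no multibyte character decoding
    while IFS= read -r -d '' rec; do
        # In the C locale, ${#rec} is the length in bytes.
        printf '%d\n' "${#rec}"
    done
)
```

Feeding it the reproducer's input, as in
printf 'FOO\0\315\0\226\0' | record_lengths, should report three records
of 3, 1 and 1 bytes. Whether forcing the C locale is acceptable depends on
what the script does with the records afterwards.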

Yes, thanks for the report. The failure cases are somewhat constrained and
limited to invalid multibyte characters immediately followed by the
delimiter. I'll fix it for the next devel branch push and this will be a
part of bash-5.3-rc2.

Chet

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
