On 4/20/25 6:58 PM, Greg Wooledge wrote:
That one may be fixed, but:

    bash-5.3$ printf 'FOO\0\315\0\226\0' | while IFS= read -rd '' f; do printf '<%q>\n' "$f"; done
    <FOO>
    <$'\315'>
    <''>
    <''>

The context for all of this was someone in IRC who was reading a chunk of data from /dev/urandom and got different results with LC_CTYPE=C vs. LC_CTYPE=en_US.utf8 (or other UTF-8 locale). This is a simplified reproducer.

In real-life scripts, this kind of thing could arise if someone reads a NUL-delimited stream of pathnames from find -print0, or equivalent.
Yes, thanks for the report. The failure cases are somewhat constrained and limited to invalid multibyte characters immediately followed by the delimiter. I'll fix it for the next devel branch push and this will be a part of bash-5.3-rc2.

Chet

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/