On 4/21/25 2:48 AM, Stephane Chazelas wrote:
> 2025-04-20 17:31:56 -0400, Chet Ramey:
> [...]
>> This has been fixed since last July, and the fix is in bash-5.3.
> [...]
>
> Thanks, though as Greg says, there seems to be a few more
> related issues still affecting 5.3. I repost a message sent
> privately below now that the discussion has been extended to the
> mailing list.
>
>> The bug concerns unicode combining characters introducing invalid
>> unicode character sequences that happen to contain the delimiter,
>> and was reported privately.
> [...]
>
> That sentence doesn't seem to make sense to me.

Say you read a byte that introduces an (incomplete) multibyte character
(mbrtowc returns -2). Then you read the delimiter character, which changes
the incomplete multibyte character into an invalid one. Instead of adding
each byte of the invalid multibyte character to the input string you're
building, you need to perform the delimiter check against the final
character.

The original bug report happened to reproduce this entirely using combining
characters.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
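
[Illustration] The sequence described above (a lead byte that leaves the character incomplete, followed by a delimiter byte that makes it invalid) can be sketched in C. This is not bash's code: `utf8_status` and `read_until_delim` are hypothetical names, `utf8_status` is a simplified, locale-independent stand-in for the -2 / -1 results of mbrtowc, and the sketch assumes a single-byte delimiter and that the earlier bytes of the broken sequence are kept as literal data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the mbrtowc() result on UTF-8 input:
   -2 if buf[0..n) is an incomplete prefix, -1 if it is invalid,
   otherwise the length of the complete character. Requires n >= 1.
   Simplified: overlong forms and surrogates are not rejected. */
static int utf8_status(const unsigned char *buf, size_t n)
{
    size_t need, i;
    unsigned char b = buf[0];

    if (b < 0x80) need = 1;
    else if (b >= 0xC2 && b <= 0xDF) need = 2;
    else if (b >= 0xE0 && b <= 0xEF) need = 3;
    else if (b >= 0xF0 && b <= 0xF4) need = 4;
    else return -1;                  /* stray continuation or bad lead byte */

    for (i = 1; i < n && i < need; i++)
        if ((buf[i] & 0xC0) != 0x80)
            return -1;               /* a non-continuation byte broke the sequence */
    return n < need ? -2 : (int)need;
}

/* Copy bytes from in[0..len) into out until delim is read.
   Returns the number of bytes stored; *found is set to 1 if the
   delimiter ended the read. out must have room for len bytes. */
size_t read_until_delim(const unsigned char *in, size_t len,
                        unsigned char delim, unsigned char *out, int *found)
{
    unsigned char pending[4];
    size_t npend = 0, nout = 0, i, k;

    assert(in != NULL && out != NULL && found != NULL);
    *found = 0;
    for (i = 0; i < len; i++) {
        int st;

        pending[npend++] = in[i];
        st = utf8_status(pending, npend);
        if (st == -2)
            continue;                /* incomplete: wait for more bytes */
        if (st == -1 && npend > 1) {
            /* The newest byte turned an incomplete character into an
               invalid one. Keep the earlier bytes as data, then look at
               the newest byte on its own instead of appending it blindly. */
            for (k = 0; k + 1 < npend; k++)
                out[nout++] = pending[k];
            pending[0] = in[i];
            npend = 1;
            if (utf8_status(pending, npend) == -2)
                continue;            /* it starts a new multibyte character */
        }
        if (npend == 1 && pending[0] == delim) {
            *found = 1;              /* delimiter check on the final byte */
            return nout;
        }
        for (k = 0; k < npend; k++)
            out[nout++] = pending[k];
        npend = 0;
    }
    for (k = 0; k < npend; k++)      /* flush a trailing incomplete prefix */
        out[nout++] = pending[k];
    return nout;
}
```

With the input bytes 'a', 0xCC, ':', 'b' and ':' as the delimiter, this sketch stores 'a' and 0xCC and reports the delimiter as found. A loop that appended both bytes of the invalid sequence would instead read past the ':'.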