Sorry, bashbug didn't work under Cygwin...

BASH_VERSION=4.4.12(3)-release
uname -a: CYGWIN_NT-6.1 xxxxxxx 2.8.0(0.309/5/3) 2017-04-01 20:47 x86_64 Cygwin

The function u32toutf16() in lib/sh/unicode.c implements surrogate pairs incorrectly. \uff08 (Full Width Left Paren) is encoded as the invalid surrogate pair d7ff df08. Unicode code points in the range 0xe000-0xffff should be encoded as a single 16-bit code unit.

To repeat (Windows 64-bit, Cygwin):

    export LANG=en_us.UTF-8
    echo $'\uff08' | hexdump -C

This prints:

    00000000  ed 9f bf ed bc 88 0a  |.......|
    00000007

This is the UTF-8 encoding of the two 16-bit values 0xd7ff and 0xdf08. It is invalid as a UTF-8 encoding; surrogate pairs should not be UTF-8 encoded.

The fix is simple: add tests for the e000-ffff range, or invert the test order and add a test for dfff (CAVEAT EMPTOR! THIS IS UNTESTED!):

    if (c >= 0x010000 && c <= 0x010ffff)
      {
        c -= 0x010000;
        s[0] = (unsigned short)((c >> 10) + 0xd800);
        s[1] = (unsigned short)((c & 0x3ff) + 0xdc00);
        l = 2;
      }
    else if (c < 0x0d800 || c > 0xdfff)
      {
        s[0] = (unsigned short) (c & 0xFFFF);
        l = 1;
      }
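
For anyone who wants to check the range logic outside of bash, here is a minimal standalone sketch of the conversion described above. The function name u32_to_utf16_sketch and the use of the <stdint.h> fixed-width types are assumptions made for illustration; this is not the code in lib/sh/unicode.c. Unlike the untested snippet, it also bounds the single-unit branch at 0xffff, so values above 0x10ffff produce no output instead of being truncated to 16 bits.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for u32toutf16(): encode code point c as UTF-16
 * into s, NUL-terminate, and return the number of 16-bit units written.
 * Returns 0 for values that have no UTF-16 encoding.
 * s must have room for at least 3 units.
 */
int u32_to_utf16_sketch(uint32_t c, uint16_t *s)
{
    int l = 0;

    if (c >= 0x10000 && c <= 0x10ffff) {
        /* Supplementary planes: split the 20-bit offset into a pair. */
        c -= 0x10000;
        s[0] = (uint16_t)((c >> 10) + 0xd800);   /* high surrogate */
        s[1] = (uint16_t)((c & 0x3ff) + 0xdc00); /* low surrogate */
        l = 2;
    } else if (c < 0xd800 || (c >= 0xe000 && c <= 0xffff)) {
        /* BMP outside the surrogate block: a single code unit. */
        s[0] = (uint16_t)c;
        l = 1;
    }
    /* Lone surrogates (d800-dfff) and values above 10ffff fall through. */

    s[l] = 0;
    return l;
}
```

With this ordering, 0xff08 comes back as the single unit ff08, and a supplementary code point such as 0x1f600 comes back as the pair d83d de00.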