Chet Ramey (<chet.ra...@case.edu>) wrote:

> On 11/24/18 4:32 PM, Bize Ma wrote:
[...]
> > I have been made aware that there is a
> >     cstart = cend = FOLD (cstart);
> > inside the `sm_loop.c` file that will convert into a range many
> > individual character. If that understanding is correct that is the
> > source of the difference with other shells.
>
> I'm not sure what you mean by "convert into a range." If cstart and cend
> were treated as a range, the start end and end characters would be the
> same. If cstart == cend, a character that collates >= cstart and <= cend
> would have to collate equal to cstart and cend.
>

Yes, exactly: a range where the start and the end are the same.

Try:

    $ touch 0 1 ٠ ١ ۰ ۱ ߀ ߁ ० १
    $ echo [1]
    1 ١

It is converted to the same range as this:

    $ echo [1-1]
    1 ١

That happens because, up to glibc 2.27, this has been the collation order
of those characters (search in /usr/share/i18n/locales/iso14651_t1_common):

    <U0030> <0>;<BAS>;<MIN>;IGNORE
    <U0660> <0>;<BAS>;<MIN>;IGNORE

Both entries collate to exactly the same values.
This breaks the ability to detect that a character is absent from a list
ordered by the collation order.
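
To make the effect concrete, here is a minimal sketch in C. It is not the code in `sm_loop.c`. The weight table is a hypothetical stand-in for the locale data (it only mirrors the idea that ASCII and Arabic-Indic digits of the same value share a weight), and the names `toy_weight`, `in_range_by_weight` and `in_range_by_codepoint` are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical weight table, NOT glibc's data: digits with the same
 * numeric value share one primary weight, as in the two
 * iso14651_t1_common entries quoted above. */
struct weight_entry { uint32_t ch; int weight; };

static const struct weight_entry toy_weights[] = {
    { 0x0030, 0 },  /* '0' DIGIT ZERO              */
    { 0x0660, 0 },  /* ARABIC-INDIC DIGIT ZERO     */
    { 0x0031, 1 },  /* '1' DIGIT ONE               */
    { 0x0661, 1 },  /* ARABIC-INDIC DIGIT ONE      */
};

/* Return the toy weight of c, or -1 if c is not in the table. */
static int toy_weight(uint32_t c)
{
    size_t n = sizeof toy_weights / sizeof toy_weights[0];
    for (size_t i = 0; i < n; i++)
        if (toy_weights[i].ch == c)
            return toy_weights[i].weight;
    return -1;
}

/* Range test by collation weight: with cstart == cend this still
 * accepts every character whose weight equals that of cstart. */
static int in_range_by_weight(uint32_t c, uint32_t cstart, uint32_t cend)
{
    int w = toy_weight(c);
    int ws = toy_weight(cstart);
    int we = toy_weight(cend);
    if (w < 0 || ws < 0 || we < 0)
        return 0;
    return ws <= w && w <= we;
}

/* Range test by code point: with cstart == cend only that exact
 * character is accepted. */
static int in_range_by_codepoint(uint32_t c, uint32_t cstart, uint32_t cend)
{
    return cstart <= c && c <= cend;
}
```

Under these assumptions, `in_range_by_weight(0x0661, '1', '1')` is true while `in_range_by_codepoint(0x0661, '1', '1')` is false, which matches the `echo [1]` output shown above.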