On 07.11.2016 22:13, Charles Swiger wrote:
> On Nov 6, 2016, at 1:49 PM, Stefan Bethke <s...@lassitu.de> wrote:
>> On 06.11.2016 at 22:27, Baptiste Daroussin <b...@freebsd.org> wrote:
>>> That works for POSIX locale aka C aka ASCII only world
>>
>> So what do I set my LANG and LC variables to? I do want UTF-8, but
>> I do also want my scripts to continue to work. Clearly,
>> en_US.UTF-8 is not what I want. Is it C.UTF-8? Or do I set
>> LANG=en_US.UTF-8 and LC_COLLATE=C?
>
> If you want to use a UTF8 locale, then you must start using character
> classes like '[:upper:]' and '[:lower:]' because those will-- or at
> least "should", modulo bugs-- properly handle the collation issues
> including for languages which do not possess a 1-1 mapping between
> upper and lower case letters.
>
> Someone with a German email address is presumably familiar with ß /
> Eszett...? :-)

Character classes are a fine replacement for [a-z], but I don't know of a simple way to match a partial range like [a-k]. Personally, I prefer the "Rational Range Interpretation" because it doesn't break backward compatibility and is still standards-compliant.
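
One possible workaround for a partial range like [a-k], sketched here in Python purely as an illustration: spell the range out as an explicit list of members, since a bracket expression that lists each character does not depend on the locale's collation order. The helper name explicit_bracket is hypothetical, and the restriction to ASCII letters and digits is an assumption made to keep the generated expression free of characters that are special inside brackets.

```python
import re


def explicit_bracket(first: str, last: str) -> str:
    """Expand a codepoint range such as a-k into an explicit bracket expression.

    Hypothetical helper: only ASCII letters and digits are accepted, so the
    result never contains ']', '^', '-' or a backslash.
    """
    if len(first) != 1 or len(last) != 1 or ord(first) > ord(last):
        raise ValueError("expected two single characters in ascending order")
    members = "".join(chr(c) for c in range(ord(first), ord(last) + 1))
    if not all(ch.isascii() and ch.isalnum() for ch in members):
        raise ValueError("range must contain only ASCII letters or digits")
    return "[" + members + "]"


pattern = explicit_bracket("a", "k")
print(pattern)  # [abcdefghijk]

# Python's own re module already orders ranges by codepoint, so this check
# only confirms that the generated expression is well-formed.
print(bool(re.fullmatch(pattern, "k")))  # True
print(bool(re.fullmatch(pattern, "K")))  # False
```

The generated string could then be pasted into a script in place of [a-k]. Whether that is preferable to running the affected commands with LC_COLLATE=C is a judgment call, and the sketch does nothing for ranges that are meant to include non-ASCII letters.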