On Jun 25 18:03, Corinna Vinschen wrote:
> On Jun 25 15:38, Lavrentiev, Anton (NIH/NLM/NCBI) [C] wrote:
> > > Your locale is zh_CN.UTF-8.  What you're expecting is only guaranteed
> > > in the C locale:
> >
> > I'm not quite sure it applies here.  I'm using US English Windows 7.
> >
> > LANG = 'en_US.UTF-8'
> >
> > I get the same result:
> >
> > $ echo abcdeABCDE | sed -e 's/[B-D]/_/g'
> > ab__eA___E
> >
> > BUT:
> >
> > $ echo abcdeABCDE | LANG=C sed 's/[B-D]/_/g'
> > abcdeA___E
> >
> > This is very weird, indeed.
> >
> > OTOH, in Linux I have the same LANG setup, yet it does work
> > correctly:
> >
> > > echo $LANG
> > en_US.UTF-8
> > > echo abcdeABCDE | sed -e 's/[B-D]/_/g'
> > abcdeA___E
> >
> > I believe that an en_US UTF-8 string representation for
> > "abcdeABCDE" is not any different from ASCII.
>
> Wrong.  Try this:
>
>   $ sort
>   a
>   b
>   c
>   d
>   e
>   A
>   B
>   C
>   D
>   E
>   <Ctrl-D>
>   a
>   A
>   b
>   B
>   c
>   C
>   d
>   D

Which also means, AFAICS, Cygwin's sed is doing it right, Linux' sed is
doing it wrong.  Yes, that puzzles me a bit at the moment, too.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
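
A minimal sketch of why the two sed results differ, assuming the range
[B-D] is evaluated against the collation order seen in the sort output
quoted above rather than against code points.  The COLLATION string and
the helper names are hypothetical and only for illustration; the trailing
"eE" is added by analogy with the other pairs, and a real locale's
collation table is far larger than this.

```python
# Assumed collation sequence, modelled on the sort output quoted above
# (the "eE" tail is an assumption added by analogy).
COLLATION = "aAbBcCdDeE"


def in_range_collation(ch, lo, hi, order=COLLATION):
    """True if ch sorts between lo and hi in the given collation order."""
    return order.index(lo) <= order.index(ch) <= order.index(hi)


def in_range_codepoint(ch, lo, hi):
    """True if ch lies between lo and hi by code point (C locale view)."""
    return lo <= ch <= hi


def substitute(text, lo, hi, in_range):
    """Mimic s/[lo-hi]/_/g using the chosen interpretation of the range."""
    return "".join("_" if in_range(ch, lo, hi) else ch for ch in text)


print(substitute("abcdeABCDE", "B", "D", in_range_collation))  # ab__eA___E
print(substitute("abcdeABCDE", "B", "D", in_range_codepoint))  # abcdeA___E
```

Under the collation view, "c" and "d" sort between "B" and "D", so they
are replaced as well; under the code point view only "B", "C" and "D"
match.  This is only a model of the two behaviours reported in the
thread, not a statement about how either sed is implemented.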