Tim Robbins wrote:
Rein Kadastik wrote:
Giorgos Keramidas wrote:
On 2005-09-03 14:17, Rein Kadastik <[EMAIL PROTECTED]> wrote:
Rein Kadastik wrote:
Well, I have one guess here. In the Estonian alphabet, z comes
immediately after s and before t. So with the way the regex orders
[a-z], the characters t, u, v, w, x, y are left out.
How do I tell sed to use the English alphabet?
Well, my guess was right. I have the following line in /etc/profile:
export LANG=et_EE.ISO8859-15
After I exported LANG=en_US.ISO8859-1, sed started to work.
I did not think that the LANG parameter would also alter the alphabet,
so that the expression [a-z] no longer covers the full alphabet.
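
Switching LANG for the whole session is one fix. A narrower sketch,
assuming only the range ordering needs to change, is to force the C
locale for the single sed invocation. LC_ALL is the standard POSIX
override; the helper name c_sed is hypothetical and not part of the
original thread.

```shell
#!/bin/sh
# Hypothetical helper: run sed in the C locale for this one command only,
# leaving the session's LANG untouched.
c_sed() {
    LC_ALL=C sed "$@"
}

# In the C locale, [a-z] is the 26 ASCII lowercase letters in code-point
# order, so t, u, v, w, x and y are matched again.
printf 'stuvwxyz\n' | c_sed 's/[a-z]/X/g'
```

Setting LC_COLLATE=C next to the existing LANG should have a similar
effect on ranges while keeping Estonian messages, provided LC_ALL is
unset, since LC_ALL takes precedence over the individual categories.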
By using a character class:
[[:alpha:]]
AFAIK, if you are using non-English locales, there's no guarantee that
[a-z] will be the entire set of lowercase letters, or that it will only
include lowercase letters, for that matter.
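
As a minimal sketch of the character-class approach: [[:alpha:]]
matches any letter, while [[:lower:]] is the closer replacement for
[a-z] when only lowercase letters are wanted. The function names
below are hypothetical, and LC_ALL=C is set only so that the result
is reproducible on any machine.

```shell
#!/bin/sh
# Hypothetical helpers: mask letters with POSIX character classes
# instead of a locale-dependent range such as [a-z].
mask_lower() {
    LC_ALL=C sed 's/[[:lower:]]/x/g'   # lowercase letters only
}
mask_alpha() {
    LC_ALL=C sed 's/[[:alpha:]]/x/g'   # any letter, upper or lower
}

printf 'Stuvwxyz 42\n' | mask_lower
printf 'Stuvwxyz 42\n' | mask_alpha
```

Without the LC_ALL=C prefix, the classes follow the current locale's
definition of a letter, which is what a localized script would want.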
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to
"[EMAIL PROTECTED]"
Yep, I know, but it does not matter. The form [a-z] is used all over
the place in the FreeBSD source (1629 lines in 4.11-RELEASE-p11 and
almost 1600 in 5-STABLE). Totally hopeless. It seems that no developer
has ever heard of character classes, and it is VERY UNSAFE to try to
compile (and actually even run) FreeBSD with any locale other than
C/en_US.ISO8859-1.
I actually searched for the existence of character classes in the
source code. I found around 30 matches, mostly in manual pages. The
Perl configure script checks whether tr supports them, but it never
actually uses the feature (even if available).
I am totally disappointed about this. I thought about reporting a
bug, but as it is everywhere, there is no point in doing so.
I think you're blowing things out of proportion. Provided that you
build world as root (which most people do), and that you don't change
the LANG setting for root (think single-user mode), the following
command will give you an approximate idea of which utilities are
affected:
$ find /usr/src -name \*.c | xargs grep -e '".*a-z' -e '".*A-Z'
25
Of these 25 hits, about half are in comments or test code that is
never built. The utilities that are genuinely affected are: kbdmap,
scon, ppp (when using ATM), m4 (in GNU compatibility mode), fdisk,
named, cvs, diff and vi.
Tim
Well, not quite. For starters, the modules that fail in my buildworld
are ncurses, csh/tcsh and gdb (interesting that there are so few, as
the problem itself is much bigger). Secondly, there are not 25 results
but rather more (most of the regexes are not in .c files). Third, I
have already sent email to Ruslan and am waiting for a response. I am
fully aware of the size of such a project and quite willing to try to
make things better.
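
To get a rough idea of how many files outside *.c are involved, a
search over every regular file can be sketched as follows. This is
only an approximation: the function name list_range_users is
hypothetical, and the patterns catch any literal "[a-z" or "[A-Z",
including ones in comments and documentation.

```shell
#!/bin/sh
# Hypothetical helper: list files under a tree that contain a literal
# [a-z or [A-Z bracket range, regardless of file extension.
# LC_ALL=C keeps find and grep themselves independent of the caller's locale.
list_range_users() {
    LC_ALL=C find "$1" -type f -exec grep -l -e '\[a-z' -e '\[A-Z' {} +
}

# Example invocation (not run here):
#   list_range_users /usr/src | wc -l
```

Each hit still has to be checked by hand, since many of them will be
harmless (for example, ranges that are only ever evaluated in the C
locale).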
And BTW, my system-wide LANG is set to et_EE.ISO8859-15, which I
personally like. As the system provides localization functionality, it
must handle it appropriately in every situation (which is not the case
right now).
Peace
-- Rein
Rein