If I set LC_CTYPE like this:

$ export LC_CTYPE=fr_FR.ISO8859-15

and then run:

$ echo "éé" | sed 's/é/\é/g'
sed: 1: "s/é/\é/g": RE error: trailing backslash (\)

Where does the program manage to find a trailing backslash (i.e. 0134),
when 'é' is 0351?

Since, to my knowledge, we do not support anything via iconv or the
like, shouldn't we simply assume a string of bytes à la C, that is:

diff --git a/usr.bin/sed/main.c b/usr.bin/sed/main.c
index d87bce2a5c85..c6b69a83cd57 100644
--- a/usr.bin/sed/main.c
+++ b/usr.bin/sed/main.c
@@ -136,7 +136,7 @@ main(int argc, char *argv[])
 	char *temp_arg;
 
 	setprogname(argv[0]);
-	(void) setlocale(LC_ALL, "");
+	(void) setlocale(LC_ALL, "POSIX");
 
 	fflag = 0;
 	inplace = NULL;

?

With such a change, the result is:

$ echo "éé" | ./sed 's/é/\é/g'
éé

and this is what I expected.

What is the rationale for taking the locale from the environment when
all the code in src expects ASCII to begin with (for commands, ranges
and so on)?

What am I doing wrong?
-- 
Thierry Laronde <tlaronde +AT+ polynum +dot+ com>
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C