Moin Alex,

On Saturday 28 January 2006 01:35, [EMAIL PROTECTED] wrote:
> Hello,
>
> I was doing some I18N of a bunch of existing CGI scripts and
> encountered a problem. I guess I'm making some very basic error, but
> I've been stuck on this for a day and thought I might ask. I have my
> strings in UTF-8. I read most of them from a file, do some processing
> and spit them out of the CGI script.
>
> Let's say I do this:
>
> $x=~y/a-ya/A-YA/;
>
> Here, with "a" I mean Cyrillic "a" (first letter of the Cyrillic
> alphabet), and with "ya" Cyrillic "ya" (last letter of the Cyrillic
> alphabet). I don't want to post Cyrillic chars here, as I don't know
> how they will show, but you understand what I mean.
>
> This doesn't work properly. I suppose it should convert the characters
> to uppercase, but what happens is that some characters do not get
> converted to uppercase, while others get converted to wrong characters.
>
> Another thing is that when I say "substr($cyrillicString,5,1)", the
> character returned is always invalid (it shows as a white question
> mark on a black diamond). All other Cyrillic strings that are not
> manipulated show properly. The problem happens when I try to get a
> character from a string, to split it, things like this.
>
> What am I doing wrong? If you reply RTFM, I'll understand and will not
> complain... :-)
Did you do:

    binmode STDIN, ':utf8';

(or the equivalent on your file handle) when reading UTF-8 from a file?

What Perl version do you use? You may have to upgrade, because versions
prior to 5.8.2 (or even later) are a bit buggy.

In addition, I think that tr/// doesn't work properly with Unicode. Have
you tried uc($string)?

Also, what output charset does your CGI set? You need to tell the
browser that you output UTF-8, or it might incorrectly guess a
different charset.

Best wishes,

Tels

-- 
Signed on Sat Jan 28 11:13:40 2006 with key 0x93B84C15.
Visit my photo gallery at http://bloodgate.com/photos/
PGP key on http://bloodgate.com/tels.asc or per email.

"Where shall I put you? Under H, like Hot, Sexy Mama?"
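
The suggestions above can be put together in a small sketch. It is only an illustration and not code from the original scripts: the sample string, the variable names, and the use of the Encode module in place of a :utf8 layer on the input handle are all assumptions. The Cyrillic letters are written as \x{...} escapes so the example survives any mail encoding.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: the Cyrillic word "privet" as a character string.
my $sample = "\x{43f}\x{440}\x{438}\x{432}\x{435}\x{442}";

# What arrives from a file read without a :utf8 layer: raw UTF-8 bytes.
my $bytes = encode('UTF-8', $sample);

# On raw bytes, substr() cuts a two-byte character in half,
# which a browser renders as the replacement character.
my $broken = substr $bytes, 5, 1;

# Decode first, then work on characters.
my $text  = decode('UTF-8', $bytes);
my $upper = uc $text;              # uppercases the Cyrillic letters
my $char  = substr $text, 5, 1;    # sixth character, intact

# Encode on output and tell the browser which charset is being sent.
binmode STDOUT, ':encoding(UTF-8)';
print "Content-Type: text/html; charset=utf-8\r\n\r\n";
print "$upper\n$char\n";
```

Opening the input file with a layer, for example open(my $fh, '<:encoding(UTF-8)', $path), would make the explicit decode() call unnecessary; $path here is a placeholder name.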