On 4/15/05, Rajarshi Das <[EMAIL PROTECTED]> wrote:
> In a script testing barewords, the character 'tau' displays when opened
> using vi editor on linux. OTOH, the same character doesnt display on z/OS
> and shows as ^69^22 in the vi editor.
> The failing line in the script is :
> %hash = (^69^22.... => 123)
>
> Perl (5.8.6) complains when it reads thus :
> "Unrecognized character \x69".
>
> If I just write \x69\x22, perl doesnt understand that I am implying the
> character 'tau'.
>
> Is there a way to display the 'tau' character as a bareword, without
> interpreting it as "\x69\x22", on a ebcdic platform ?
>
> Thanks for all the help,
> Rajarshi.
>
> >From: Chris Devers <[EMAIL PROTECTED]>
> >Reply-To: Perl Beginners List <beginners@perl.org>
> >To: Rajarshi Das <[EMAIL PROTECTED]>
> >CC: Perl Beginners List <beginners@perl.org>
> >Subject: Re: what are utf8 barewords.?
> >Date: Thu, 14 Apr 2005 10:27:13 -0400 (EDT)
> >
> >On Thu, 14 Apr 2005, Rajarshi Das wrote:
> >
> > > Barewords acccording to perldata.pod are "words that donot have any
> > > other meaning in grammar".
> > >
> > > 1) So, does this mean that any word which is not reserved is a
> > > bareword ?
> >
> >Off the top of my head, every "token" of text in Perl is either:
> >
> > * an operator: +, *, =~, s///, ..
> >
> > * a built-in function: chomp(), map(), grep()
> >
> > * an imported function or method from a module: $cgi->param()
> >
> > * a user defined subroutine or method: do_stuff_with()
> >
> > * a string: "including" qw{ things like }, qq[ this ], 'or', "this"
> >
> > * a bareword: FILEHANDLE, etc
> >
> >I may have missed a class or two, but that's most of them.
> >
> > > 2) What exactly would be a utf8 bareword ? Is it any utf8 encoded
> > > character ?
> >
> >A non- operator / function / method / subroutine / string that includes
> >one or more UTF8 characters.
> >
> > > Any examples ?
> > > Would "\x69\x22" qualify as a utf8 bareword ?
> >
> >Well, if used exactly as you have it there, it's a string, because it's
> >wrapped in double quotes. If you just had
> >
> > \x69\x22
> >
> >by itself, then yes, it would be a bareword.
> >
> >--
> >Chris Devers

Simply put, it's a bareword because you didn't double quote it. You're
also not using perl's unicode notation. That's part of the issue. The
other part is the encoding. Right now your two editors are using
different encodings, and neither of them is unicode: 'GREEK SMALL
LETTER TAU' is U+03C4 and 'GREEK CAPITAL LETTER TAU' is U+03A4. This
isn't perl's fault. To make these systems talk to each other, you're
going to have to explicitly give the code points for non-ascii
characters instead of relying on your editor, which will give you
ebcdic code points on the one hand and, probably, ISO-Latin-something
on the other. So, for capital tau:

    %hash = ("\x{03a4}" => 123)

When it comes to saving and printing, particularly on ebcdic systems,
there are other issues, but take a look through

    perldoc perluniintro
    perldoc perlunicode
    perldoc perlebcdic
    perldoc Encode
    perldoc PerlIO
    perldoc utf8

I know that looks like a long list, but maintaining encodings across
different platforms is tricky once you get into extended character sets,
because it's not enough for perl to get it right; you have to be able
to print it to screen without getting "Wide character in..." warnings.

HTH,

--jay
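
For reference, here is a minimal sketch of the advice above: the hash key is written as an explicit code point, not a character typed in an editor, and an output layer is declared before printing. It assumes the small letter (U+03C4) is the one wanted and that the terminal expects UTF-8. Both are assumptions, not something the original script confirms, and this has not been tried on z/OS.

```perl
use strict;
use warnings;

# Name the code point explicitly so the source file stays pure ASCII
# and does not depend on the editor's encoding.
# U+03C4 is GREEK SMALL LETTER TAU (U+03A4 would be the capital).
my %hash = ( "\x{03c4}" => 123 );

# Declare an output encoding so print does not warn
# "Wide character in print". UTF-8 is an assumption here;
# substitute whatever encoding your terminal actually uses.
binmode(STDOUT, ':encoding(UTF-8)');

for my $key (keys %hash) {
    # ord() returns the Unicode code point of the one-character key.
    printf "%s (U+%04X) => %d\n", $key, ord($key), $hash{$key};
}
```

On a UTF-8 terminal this should show the tau character followed by its code point and the value 123. Without the binmode line, perl would still print something but would emit the "Wide character" warning.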