Hi,

> length() returns the length in characters, which
> for ASCII is also the number of bytes. To get
> the bits, just multiply by 8.
> If you are using a Unicode character set
> instead, I'm not too sure what will be returned,
> or how you can convert it to bits.

Unicode can get pretty hairy, but it's my impression that the number of bytes per character varies depending on your encoding. UTF-8, the de facto standard nowadays, is a variable-length encoding: a character takes between 1 and 4 bytes (older definitions of the encoding allowed up to 6), if I recall correctly.

I was curious about trying this out, so I modified a crufty little script I had hanging around. The bottom line is that length() returns characters here too, just Unicode characters rather than ASCII ones. Combining characters are counted separately, each as a character of its own.

Anyway, I use it like this:

$ perl describechars in.utf8 > out.utf8

Then you can view out.utf8 with an editor that can grok whatever language you happen to be dealing with. Sure enough, Perl counts characters, not bytes, with Unicode text (once the input has been decoded as UTF-8).

More scintillating details at:

http://www.perldoc.com/perl5.8.0/pod/perlunicode.html

If you have the module Unicode::CharName (it's on CPAN, not in the core distribution as far as I know), you can try out my goofy script:

#!/usr/bin/perl -w
use strict;
use Encode qw(decode);
use Unicode::CharName qw(uname);

# Write UTF-8 so the characters survive the trip to out.utf8.
binmode STDOUT, ':encoding(UTF-8)';

while (<>) {
    chomp;
    # Decode the raw bytes so that length() and split see characters.
    $_ = decode('UTF-8', $_);
    print "~-" x 15, "\n";
    s/^\s+//;                           # trim leading whitespace
    s/\s+$//;                           # trim trailing whitespace
    my @chars = split //, $_;
    print "$_\n";                       # the line
    print join(' + ', @chars), "\n";    # the individual chars
    print "length is: ", length($_), "\n";
    for my $char (@chars) {
        print "[ $char ]\t",
              uname( ord($char) ),      # uname gives the Unicode name
              "\t",
              sprintf( "U+%04X", ord($char) ),   # code point in hex
              "\n";
    }
}
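To get back to the original question (going from length() to bits), here is a minimal sketch of one way to do it, assuming the string has already been decoded into Perl characters. The helper names char_count, byte_count and bit_count are made up for this example; Encode is a core module as of Perl 5.8. The byte count depends on the encoding you pick, so the sketch takes it as an optional argument and assumes UTF-8 otherwise.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Number of characters (code points) in an already-decoded string.
sub char_count {
    my ($str) = @_;
    return length($str);
}

# Number of bytes the string occupies once encoded.
# The encoding defaults to UTF-8 (an assumption of this sketch).
sub byte_count {
    my ($str, $encoding) = @_;
    $encoding = 'UTF-8' unless defined $encoding;
    return length( encode($encoding, $str) );
}

# Bits are simply eight per encoded byte.
sub bit_count {
    return 8 * byte_count(@_);
}

# "cafe" with e-acute, a space, and U+263A WHITE SMILING FACE
my $str = "caf\x{e9} \x{263A}";
printf "%d chars, %d bytes, %d bits\n",
    char_count($str), byte_count($str), bit_count($str);
# prints "6 chars, 9 bytes, 72 bits"
```

The same helpers show the combining-character behaviour: "e\x{301}" (an e followed by a combining acute accent) is 2 characters and 3 bytes in UTF-8, even though it displays as a single accented letter.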