Thanks to much help from the list, and hours of reading up on Unicode, the Encode module, and many posts to perlmonks, I've come up with a hideous solution for processing text files with different character encodings.
Can someone please explain why this first block of code works when decoding .txt files of different character encoding types:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Encode::Guess;

    print "\nPlease specify the file path: ";
    my $datapath = <STDIN>;
    $datapath =~ s/^\s+//;
    $datapath =~ s/\s+$//;

    open (my $filehndl , "<", "$datapath") || die ("Can't open .txt file $datapath. Exiting program.\n\n");
    binmode($filehndl);
    if (read($filehndl, my $filestrt, 500)) {
        my $enc = guess_encoding($filestrt);
        if (ref($enc)) {
            my $enc_name = $enc->name;
            #my $encoding = find_encoding("$enc_name");
            open (my $filehdl2 , "<:encoding($enc_name)" , "$datapath");
            while (my $line = <$filehdl2>) {
                #my $line = $encoding->decode($string);
                #my $line = decode("$enc_name", $string);
                chomp $line;
                my @words = split / /, $line;
                my $nr_words = @words;
                print "\n$line\n";
                print "The line above has " . scalar @words . " occurrences of something.\n";
            }
            close ($filehdl2);
        }
    }
    close ($filehndl);

But this second generates the error:

    UTF16: Unrecognised BOM 6100 at /usr/lib/perl/5.10//Encode.pm line 162, <$filehndl> line 1.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Encode;
    use Encode::Guess;

    print "\nPlease specify the file path: ";
    my $datapath = <STDIN>;
    $datapath =~ s/^\s+//;
    $datapath =~ s/\s+$//;

    open (my $filehndl , "<", "$datapath") || die ("Can't open .txt file $datapath. Exiting program.\n\n");
    binmode($filehndl);
    if (read($filehndl, my $filestrt, 500)) {
        my $enc = guess_encoding($filestrt);
        if (ref($enc)) {
            my $enc_name = $enc->name;
            while (my $line = decode("$enc_name", <$filehndl>)) {
                chomp $line;
                my @words = split / /, $line;
                my $nr_words = @words;
                print "\n$line\n";
                print "The line above has " . scalar @words . " occurrences of something.\n";
            }
        }
    }
    close ($filehndl);

Otherwise, can someone suggest a more elegant way of accomplishing this? It doesn't seem like I should have to open the file twice, as I'm doing in the first block. I can't figure out any way around that, though.
Thanks for any help!
-Doug.

===
Douglas Cacialli, M.A. - Doctoral candidate
Clinical Psychology Training Program
University of Nebraska-Lincoln
Lincoln, Nebraska 68588-0308
===
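A possible direction for the single-open question, offered as a sketch rather than a tested answer. In the second block, read() has already consumed the first 500 bytes, so the BOM at the start of the file never reaches decode(), and <$filehndl> is still splitting raw bytes on 0x0A rather than on decoded newlines. The 6100 in the message looks like the two bytes of a little-endian "a" arriving where a BOM was expected. Assuming the file is seekable, one way around the second open is to rewind the handle and push an :encoding layer onto it with binmode. The helper names open_decoded and words_per_line below are made up for illustration and are not from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode::Guess;

# Hypothetical helper (not from the original post): open $path once,
# guess its encoding from the first 500 bytes, rewind, and attach a
# decoding layer to the same handle.
sub open_decoded {
    my ($path) = @_;

    open(my $fh, '<:raw', $path)
        or die "Can't open $path: $!\n";

    # Sample raw bytes for the guess.
    read($fh, my $head, 500)
        or die "Can't read from $path (empty file?)\n";

    # guess_encoding returns an object on success, an error string otherwise.
    my $enc = guess_encoding($head);
    ref($enc) or die "Can't guess encoding of $path: $enc\n";

    # Rewind so the decoding layer sees the file from byte 0, BOM included.
    seek($fh, 0, 0) or die "Can't rewind $path: $!\n";
    binmode($fh, ':encoding(' . $enc->name . ')');

    return ($fh, $enc->name);
}

# Returns one array ref of words per line, read through the decoding layer.
sub words_per_line {
    my ($path) = @_;
    my ($fh) = open_decoded($path);
    my @result;
    while (my $line = <$fh>) {
        chomp $line;
        push @result, [ split / /, $line ];
    }
    close($fh);
    return @result;
}
```

With this shape, the loop from the first block could stay as it is and read from the handle returned by open_decoded($datapath). A 500-byte sample can still cut a multi-byte character in half or be ambiguous between encodings, so the guess may fail on some files; the sketch simply dies in that case.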