I'm completely baffled by this and not entirely sure where to start.

I have a plain text file, testfile.txt, which contains a single line:

Very truly yours,

It is written exactly how you see it above, with a newline at the end.

I'm trying to write a script that will determine the number of words
in the file.  A snippet of what I have thus far is the following:

my $fh = IO::File->new("$lvl2path/$filestng", "r") ||
        die ("Can't open .txt file named at $lvl2path.  Exiting program.\n\n");
while (my $line = $fh->getline())
{
        my @words = split /\s+/, $line;
        my %count = ();
        $count{$line} += @words;
        print "$line";
        print "The line above has " . scalar @words . " occurrences of something.\n";
}
$fh->close();

That outputs the following:

V e r y  t r u l y  y o u r s ,
The line above has 3 occurrences of something.

I understand that split /\s+/ is matching whitespace characters, and
I'm pleased that it comes back with 3 (two spaces and the newline).
What I don't understand is why the output has spaces between all the
letters.  I've looked at this and other .txt files in different
editors on different OSes; I can't find any hidden characters,
whitespace or other, anywhere they don't belong.  What's really
concerning is when I change the above such that:

my @words = split /\w+/, $line;

I get this:

V e r y  t r u l y  y o u r s ,
The line above has 15 occurrences of something.

Where is this whitespace coming from between the letters??  Is it
really whitespace (/\s+/ doesn't catch it, but /\w+/ is catching each
character as if there's whitespace between)??  A good part of my
dissertation hinges on being able to read thousands of .txt files
without the extraneous spaces that are being introduced somewhere.
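
Since the editors show nothing unusual, a byte-level dump of the first line
might help narrow down where the extra characters come from. The following
is only a minimal sketch: hex_bytes is a hypothetical helper name, and the
path is assumed to point at one of the affected files. It reads the line
in raw mode so that no decoding layer alters the bytes before they are
printed.

```perl
use strict;
use warnings;

# Hypothetical helper: return the bytes of a string as
# space-separated two-digit hex values.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', unpack '(H2)*', $bytes;
}

# Assumed path; substitute a file that shows the problem.
my $path = 'testfile.txt';

# ':raw' keeps Perl from translating or decoding anything on input.
if (open my $fh, '<:raw', $path) {
    my $line = <$fh>;
    close $fh;
    print hex_bytes($line), "\n" if defined $line;
}
else {
    warn "Can't open $path: $!\n";
}
```

In a plain ASCII file, "Very" would appear as 56 65 72 79, a space as 20,
and the newline as 0a; any other values between those would be the bytes
that the editors are not displaying.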

By the way, only some files appear affected, but there's no obvious pattern.

Any hints would be wildly appreciated.

===
Douglas Cacialli, M.A. - Doctoral candidate
Clinical Psychology Training Program
University of Nebraska-Lincoln
Lincoln, Nebraska 68588-0308
===
