I forgot to explain.
> > 1. I want to read in a text file and match any line that begins with
> > three capital letters followed by a space. i.e. "USD "
>
> while (<>) {
<> will read from the file(s) you specify on the command
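line when you run your perl script, i.e.
perl myscript.pl inputfile
or, if the script itself is executable on your system, just
myscript.pl inputfile
If you give no file names, <> reads from standard input instead.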
The magical incantation:
while (<>) {
reads through the input file(s) a line at a time,
putting the line in $_, which is a special 'default'
variable that is assumed by lots of other perl
functions.
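Here is a small sketch of that loop. So that it runs without an
input file, it reads from an in-memory filehandle instead of <>;
the sample text and the names $text, $fh and @seen are made up
for illustration.

```perl
use strict;
use warnings;

# Made-up sample input; a real script would use while (<>) and a file name.
my $text = "first line\nsecond line\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my @seen;
while (<$fh>) {        # each pass puts the next line in $_
    chomp;             # with no argument, chomp works on $_
    push @seen, $_;
    print "got: $_\n"; # $_ interpolates like any other variable
}
close $fh;
```

Swapping <$fh> for <> should give the same loop over the files
named on the command line.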
> /^[A-Z]{3} / and dostuff; # $_ contains line
Use
/stuff/
to match something in $_.
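In a // expression, ^ matches at the start of the string. Since $_
holds one line at a time here, that is the start of the line.
($ matches at the end of the line, just before the trailing
newline if there is one.)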
[abc] is a // expression "atom" that matches one character
that is a, b, or c.
[a-c] would do the same job.
Following some atom with {min,max} tells perl how many
times to match. A single number, as in {3}, means exactly
that many times.
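To see an atom and a count working together, here is a quick
sketch. The test strings and the name @results are made up, and
the pattern is anchored at both ends so that the count is easy
to see.

```perl
use strict;
use warnings;

my @results;
for my $s ('ab', 'abc', 'abcd', 'xbc') {
    # ^ and $ pin the match to the whole string;
    # [a-c]{2,3} wants two or three characters from a, b, c.
    push @results, ($s =~ /^[a-c]{2,3}$/ ? "match" : "no match");
}
print "$_\n" for @results;
```

Only 'ab' and 'abc' should match: 'abcd' ends in a character
outside the class, and 'xbc' starts with one.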
The word 'and' means, if the thing on the left is true,
then also do the thing on the right.
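dostuff was a made-up name for a subroutine that
you would have to declare elsewhere, like this (if the
declaration comes after the call, write dostuff() with
parentheses so that perl treats it as a call):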
sub dostuff {
# code
}
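Putting those pieces together, a minimal sketch might look like
this. The body of dostuff, the @hits array and the sample lines
are all made up for illustration; your dostuff would do whatever
you need with the line.

```perl
use strict;
use warnings;

my @hits;

# Hypothetical handler: here it just remembers the matching line.
sub dostuff {
    my ($line) = @_;
    push @hits, $line;
}

# Made-up sample lines standing in for the input file.
for ("USD 1.00\n", "usd 1.00\n", "EURO 2.00\n", "GBP 3.00\n") {
    /^[A-Z]{3} / and dostuff($_);   # $_ holds the current line
}
print @hits;
```

The "usd" line fails because the letters are not capitals, and
the "EURO" line fails because the fourth character is not a space.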
> > 2. I need to ignore any blank lines, lines containing all "---",
> > lines containing all "===".
>
> while (<>) {
>
> /^(\s|-|=)*$/ and next;
I got this wrong.
The basic principle was to use () brackets to turn the content
into a // expression atom, then use * after that to tell perl how
many times to match the atom. This is just like {min,max}; * is
shorthand for {0,infinity}.
The \s means match any whitespace character. If a bunch of
things in a //, or in a () enclosed atom, are separated by |
symbols, then perl can match any of the separated things.
I got it wrong because what I wrote will match, say:
-= -= -= -= -= -= -= - =- = - ====-=
or other combinations.
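The over-matching is easy to check with a quick sketch. The
sample lines and the names $loose and %matched are made up for
illustration.

```perl
use strict;
use warnings;

my $loose = qr/^(\s|-|=)*$/;   # the pattern from my earlier reply

my %matched;
for my $line ("-= -= - ====-=\n", "------\n", "USD 1.00\n") {
    # Each character may be whitespace, - or =, in any mixture.
    $matched{$line} = ($line =~ $loose) ? 1 : 0;
    print $matched{$line} ? "skipped: " : "kept:    ", $line;
}
```

Both the mixed line and the all-dashes line should come out as
skipped, which is more than you asked for.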
This would more accurately fit what you asked for:
/^(\s*|-*|=*)$/ and next;
The word 'next' means to go on to the next iteration
in the loop code containing the 'next' command.
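As a sketch of the stricter pattern at work in a loop, the
snippet below reads made-up sample text from an in-memory
filehandle (so that it runs without an input file) and keeps
only the lines that are not skipped. The names $text, $fh and
@kept are assumptions for illustration.

```perl
use strict;
use warnings;

# Made-up sample input: data, a blank line, two rulers, a mixed line, data.
my $text = "USD 1.00\n\n------\n======\n-=-=-=\nGBP 3.00\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my @kept;
while (<$fh>) {
    /^(\s*|-*|=*)$/ and next;   # skip blank, all-dash and all-equals lines
    push @kept, $_;             # only reached when 'next' did not fire
}
close $fh;
print @kept;
```

The mixed "-=-=-=" line is kept this time, since each alternative
has to account for the whole line on its own.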