Here's the normal, efficient Perl 5 way to open a file and do work on
each line of it.

    open FILE, "some/file" or die "Can't open some/file: $!";
    while(<FILE>) {
        print;
    }
    close FILE;

Now consider this Perl 5 code that does the exact same thing.

    foreach_line { print } 'some/file';
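
Ok, that's sort of neat. Very compact. Minimum of fuss. You might
implement this like so:

    sub foreach_line (&$) {
        my($block, $file) = @_;
        # "or", not "||": "||" binds tighter than the comma, so the die
        # would only test $file and never the result of open().
        open FILE, $file or die "Can't open $file: $!";
        while(<FILE>) {
            $block->();    # one full subroutine call per line
        }
        close FILE;
    }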

But there are two problems. First, it relies on a prototype, so it will
only work on a function, never on a method. Second, we have to make a
subroutine call for each and every line of the file. While it's not as
expensive as you might think, it's still expensive.
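
To get a rough feel for that per-line cost, here's a minimal sketch using the core Benchmark module. The @lines array is a stand-in for a file so that I/O doesn't drown out the difference, and count_inline() and count_callback() are names made up for this example. The numbers will vary with your machine and perl build, so treat the output as a ballpark only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-in for the lines of a file, so the timing
# measures only the per-line work and not disk I/O.
my @lines = map { "line $_\n" } 1 .. 10_000;

# Work done directly in the loop body.
sub count_inline {
    my $chars = 0;
    for (@lines) {
        $chars += length;
    }
    return $chars;
}

# The same work, but through a code reference called once per line,
# the way foreach_line() calls $block->().
sub count_callback {
    my $chars = 0;
    my $block = sub { $chars += length };
    for (@lines) {
        $block->();
    }
    return $chars;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    inline   => \&count_inline,
    callback => \&count_callback,
});
```

Both subs compute the same total; only the way the loop body gets invoked differs.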

What if, instead of the pseudo-block argument we currently have (which
is really just a code reference), we could attach a real block to a
function or method call?

    foreach_line('some/file') {
        print;
    }

Or how about a foreach() method of the File class? [1]

    File.foreach('some/file') {
        print;
    }

Then it would be just as efficient as a regular loop! One might write
File::foreach() like so:

    sub File::foreach {
        my($class, $file) = @_;
        open(FILE, $file) || die "Can't open $file: $!";
        while(<FILE>) {
            yield;
        }
        close FILE;
    }

yield() [2] simply says "run the block associated with this method
once". It's similar to the $block->() call, but since it's not a full
subroutine call, just a block enter/exit (like a normal iteration
through a loop), there are two important differences.

1) caller() is not affected.
2) It's about as efficient as a regular loop.
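
To see what "caller() is affected" means with today's code references, here's a small Perl 5 sketch. run_block() and outer() are hypothetical names for illustration. The block is really an anonymous sub, so from inside it caller() reports a frame of its own rather than the frame of the function the block was written in.

```perl
use strict;
use warnings;

# Hypothetical helper: runs its block through a code reference,
# the same way the Perl 5 foreach_line() does.
sub run_block (&) {
    my ($block) = @_;
    return $block->();
}

sub outer {
    # Name of the current subroutine frame, asked from plain code...
    my $plain = (caller(0))[3];
    # ...and asked from inside a block handed to run_block().
    my $in_block = run_block { (caller(0))[3] };
    return ($plain, $in_block);
}

my ($plain, $in_block) = outer();
print "plain code sees: $plain\n";
print "the block sees:  $in_block\n";
```

Run as a script, the first line names outer() and the second names an anonymous sub (__ANON__). With a true block and yield(), the idea is that both would report the same frame.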

This iterator syntax isn't limited to files. Consider File::Find.

    File.find('some/dir', 'some/other/dir') {
        print if -x && /nethack/;
    }

Or DBI.

    $sth.foreach {
        print "Data: ". join ' - ', @$_;
    }

If we add in the ability to pass in named arguments, it becomes even
more powerful. Want a foreach loop to get each element of an array
*and* the current index? (I'm assuming arrays will be objects.)

    @array.foreach { |$elem, $idx|    # [3]
        print "Row #$idx - $elem\n";
    }

You might implement this as:

    sub ARRAY::foreach {
        my($self) = shift;
        for my $idx (0..$#$self) {
            yield($self->[$idx], $idx);    # $self is an array reference
        }
    }

The arguments to yield() get aliased into the block as $elem and $idx.
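
For comparison, here's roughly how you'd get the same effect in Perl 5 today. This is only a sketch: each_with_index() is a hypothetical name, the block arrives as a code reference through the & prototype, and the "named arguments" have to be unpacked from @_ by hand. The elements of @_ are already aliases to what was passed in, which is the behavior yield() would keep.

```perl
use strict;
use warnings;

# Hypothetical helper name. The \@ prototype passes the array by reference.
sub each_with_index (&\@) {
    my ($block, $array) = @_;
    for my $idx (0 .. $#$array) {
        # In place of yield($elem, $idx): a full subroutine call.
        $block->($array->[$idx], $idx);
    }
    return;
}

my @array = qw(apple banana cherry);
my @rows;
each_with_index {
    my ($elem, $idx) = @_;    # unpack by hand; no |$elem, $idx| here
    push @rows, "Row #$idx - $elem";
} @array;

print "$_\n" for @rows;
```

It works, but it pays for a subroutine call per element and only works as a function, which are the two complaints from earlier.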

How about a simple, efficient implementation of a grep that only finds
the first element?

    my $first = first @list { /foo/ };

    sub first {
        my(@list) = @_;
        foreach (@list) {
            return $_ if yield;
        }
    }

yield() returns the last evaluated expression of its block.

In fact, I'd go so far as to say junk the current meaning of the &
prototype, which has never really lived up to its promises, and
change it to mean a true block. This allows the more natural form...

    my $first = first { /foo/ } @list;

    sub first (&@) {
        my(@list) = @_;
        foreach (@list) {
            return $_ if yield;
        }
    }
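
For reference, this is what the calling form looks like with the & prototype as it stands in Perl 5. It's a sketch under the current rules, not the proposed semantics: the block is a code reference, and each test is a real subroutine call. I've called it first_match() (a made-up name) to keep it apart from List::Util's first(), which does the same job.

```perl
use strict;
use warnings;

# Hypothetical name, chosen to avoid confusion with List::Util::first.
sub first_match (&@) {
    my $block = shift;
    for (@_) {
        # $_ is aliased to each element in turn; the block tests it.
        return $_ if $block->();
    }
    return;    # no element matched
}

my @list  = qw(bar food foo baz);
my $first = first_match { /foo/ } @list;
print "$first\n";
```

The loop stops at the first element the block accepts, so nothing after it gets tested.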

This is an extremely powerful, concise, flexible and efficient
construct. Having used it in Ruby, I feel it would make a lovely
addition to Perl 6, allowing us to smooth out the syntax of some of
our traditionally nasty modules (FileHandle, IO, File::Find, etc...)
and handily answer some of our most common loop feature requests.

The particular details of the syntax matter less than these two
features:

1) The ability to attach a block to a method/function call.
2) That it's as efficient as a regular loop.

[1] Astute readers might recognize this as Ruby code.
[2] It doesn't have to be called yield(). That's just what Ruby and CLU
    call it.
[3] This is adopted from the Ruby syntax. I don't know how to perlify it.
--
Michael G. Schwern          http://www.pobox.com/~schwern/
Perl6 Quality Assurance     Kwalitee Is Job One
Death follows me like a wee followey thing.
-- Quakeman