Custom iterators

Michael G Schwern Mon, 24 Sep 2001 18:29:42 -0700
Here is the normal, efficient Perl 5 way to open a file and do work on it:

    open FILE, "some/file";
    while(<FILE>) {
        print;
    }
    close FILE;

Now consider this Perl 5 code that does the exact same thing.

    foreach_line { print } 'some/file';

Ok, that's sort of neat.  Very compact.  Minimum of fuss.  You might
implement this like so:

    sub foreach_line (&$) {
        my($block, $file) = @_;

        open(FILE, $file) || die "Can't open $file:  $!";
        while(<FILE>) {
            $block->();
        }
        close FILE;
    }

But there are two problems.  First, the & prototype only works on a
function, not on a method call.  Second, we have to make a subroutine
call for each and every line of the file.  While it's not as expensive
as you might think, it's still expensive.
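
For reference, here is a minimal runnable sketch of the Perl 5 emulation
described above.  The scratch file and the @seen array are assumptions added
purely for illustration; a lexical filehandle and three-argument open replace
the bareword FILE.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Perl 5 emulation: the "block" is really an anonymous sub passed
# through the & prototype.
sub foreach_line (&$) {
    my ($block, $file) = @_;

    open(my $fh, '<', $file) or die "Can't open $file: $!";
    while (<$fh>) {
        $block->();    # one full subroutine call per line
    }
    close $fh;
}

# Write a small scratch file so the example is self-contained.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "alpha\nbeta\ngamma\n";
close $tmp_fh;

my @seen;
foreach_line { chomp(my $line = $_); push @seen, $line } $tmp_name;
print join(',', @seen), "\n";    # prints "alpha,beta,gamma"
```

Note that the block comes before the filename here, because the & prototype
only recognizes a block in the first argument position.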

What if, instead of the pseudo-block argument we currently have, which
is really just a code reference, we could attach a real block to a
function or method call?

    foreach_line('some/file') {
        print;
    }

or how about a foreach() method of the File class? [1]

    File.foreach('some/file') {
        print;
    }

Then it would be just as efficient as a regular loop!  One might write
File::foreach() like so:

    sub File::foreach {
        my($class, $file) = @_;

        open(FILE, $file) || die "Can't open $file:  $!";
        while(<FILE>) {
            yield;
        }
        close FILE;
    }

yield() [2] simply says "run the block associated with this method
once".  It is similar to the $block->() call, but since it's not a full
subroutine call, just a block enter/exit (like a normal iteration
through a loop), there are two important differences.

1)  caller() is not affected.
2)  it's about as efficient as a regular loop.
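
To make the caller() point concrete, here is a small Perl 5 sketch comparing a
code reference call with a bare block.  The names run_block() and outer() are
hypothetical, and a do-block only approximates what a yielded block would see
under this proposal.

```perl
use strict;
use warnings;

# Call a code reference once, the way the Perl 5 foreach_line() does.
sub run_block {
    my ($block) = @_;
    return $block->();
}

sub outer {
    # Inside the anonymous sub, caller(0) describes the anonymous sub's
    # own frame, so the call stack has changed.
    my $via_sub = run_block(sub { (caller(0))[3] });

    # A bare block is only a scope, not a call: caller(0) still
    # describes outer().
    my $via_block = do { (caller(0))[3] };

    return ($via_sub, $via_block);
}

my ($via_sub, $via_block) = outer();
print "code ref:   $via_sub\n";      # prints "code ref:   main::__ANON__"
print "bare block: $via_block\n";    # prints "bare block: main::outer"
```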


This iterator syntax isn't limited to files.  Consider File::Find.

    File.find('some/dir', 'some/other/dir') {
        print if -x && /nethack/;
    }
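
For comparison, this is roughly how the same search looks with today's
File::Find, which takes a code reference as its first argument.  The temporary
directory and file names are made up for the example, and -f stands in for -x
so that no chmod is needed.

```perl
use strict;
use warnings;
use File::Find qw(find);
use File::Temp qw(tempdir);

# Build a scratch directory with two empty files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(nethack.sav notes.txt)) {
    open(my $fh, '>', "$dir/$name") or die "Can't create $dir/$name: $!";
    close $fh;
}

my @found;
# find() calls the code reference once per entry, with $_ set to the
# entry's name relative to the directory being visited.
find(sub { push @found, $_ if -f $_ && /nethack/ }, $dir);
print "@found\n";    # prints "nethack.sav"
```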

Or DBI

    $sth.foreach {
        print "Data:  ". join ' - ', @$_;
    }

If we add in the ability to pass in named arguments, it becomes even
more powerful.  Want a foreach loop to get each element of an array
*and* the current index?  (I'm assuming arrays will be objects)

    @array.foreach { |$elem, $idx|          # [3]
        print "Row #$idx - $elem\n";
    }

You might implement this as:

    sub ARRAY::foreach {
        my($self) = shift;

        for my $idx (0..$#$self) {
            yield($self->[$idx], $idx);
        }
    }

The arguments to yield() get aliased into the block as $elem and $idx.
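
The closest Perl 5 can get today is to pass the element and index as
arguments to a code reference.  A sketch, where each_with_index() is a
hypothetical helper name and the block has to unpack @_ by hand:

```perl
use strict;
use warnings;

# Hypothetical helper: calls the block with (element, index) for each
# item.  The \@ prototype passes the array by reference.
sub each_with_index (&\@) {
    my ($block, $array) = @_;

    for my $idx (0 .. $#$array) {
        $block->($array->[$idx], $idx);    # arrive in the block's @_
    }
}

my @array = qw(apple banana cherry);
my @rows;
each_with_index {
    my ($elem, $idx) = @_;
    push @rows, "Row #$idx - $elem";
} @array;
print "$_\n" for @rows;
```

This costs one subroutine call per element, which is exactly the overhead a
real yield() is meant to avoid.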

How about a simple, efficient implementation of a grep that only finds
the first element?

    my $first = first @list { /foo/ };

    sub first {
        my(@list) = @_;

        foreach (@list) {
            return $_ if yield;
        }
    }

yield() returns the last evaluated expression of its block.
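
In current Perl 5 the same idea has to go through the & prototype and a code
reference call.  A minimal sketch, using the hypothetical name first_match()
so it does not clash with List::Util's first():

```perl
use strict;
use warnings;

# Return the first element for which the block is true, stopping as
# soon as a match is found.
sub first_match (&@) {
    my ($block, @list) = @_;

    for (@list) {
        return $_ if $block->();    # the block tests the current $_
    }
    return;    # nothing matched
}

my @list  = qw(bar food foo);
my $first = first_match { /foo/ } @list;
print "$first\n";    # prints "food"
```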

In fact, I'd go so far as to say junk the current meaning of the &
prototype, which has never really lived up to its promises, and
change it to mean a true block.  This allows the more natural form...

    my $first = first { /foo/ } @list;

    sub first (&@) {
        my(@list) = @_;

        foreach (@list) {
            return $_ if yield;
        }
    }


This is an extremely powerful, concise, flexible and efficient
construct.  Having used it in Ruby, I feel it would make a lovely
addition to Perl 6, allowing us to smooth out the syntax of some of
our traditionally nasty modules (FileHandle, IO, File::Find, etc...)
and handily answer some of our most common loop feature requests.

The particular details of the syntax matter less than these two
features:

    1) The ability to attach a block to a method/function call.
    2) That it's as efficient as a regular loop.


[1] Astute readers might recognize this as Ruby code.

[2] Doesn't have to be called yield().  That's what Ruby and CLU call it.

[3] This is an adoption of the Ruby syntax.  Don't know how to perlify it.

-- 

Michael G. Schwern   <[EMAIL PROTECTED]>    http://www.pobox.com/~schwern/
Perl6 Quality Assurance     <[EMAIL PROTECTED]>       Kwalitee Is Job One
Death follows me like a wee followey thing.
        -- Quakeman