Aaron Sherman wrote:
So if I were to write my own function and wanted it to sit in the middle of a pipeline, but also wanted it to be lazily evaluated, how would I write that?
On Mon, 2004-08-30 at 16:34, Rod Adams wrote:
@x = @y ==> map lc ==> grep length == 4;
I would think you actually want to be able to define grep, map, et al. in terms of the mechanism for unraveling, and just let the optimizer collapse the entire pipeline down to a single map.
To propose one way of doing it (and really just a simple example off the top of my head which may not be the best idea...):
    macro grep(&cond, *@list) is mapper {
        if cond($_) { ($_) } else { () }
    }
    macro map(&cond, *@list) is mapper {
        cond($_)
    }
Which would do two things:
1. Define a subroutine of the same name that does the full map:
    sub grep (&cond, *@list) is mapper {
        my @result;
        for @list -> $_ {
            push @result, $_ if cond($_)
        }
        return @result;
    }
2. Populate an optimizer table with just the macro form.
When you see C<grep {...} map {...} grep {...}> now, you can just consult that table and determine how much of the pipeline can be transformed into a single map.
There, you're done. No more pipeline overhead from list passing.
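As a rough illustration of the collapsing idea, here is a minimal sketch in Python rather than Perl 6. It assumes each "mapper" is a function from one element to a list of zero or more results, which is how the macro bodies above read. The names `grep_mapper`, `map_mapper`, `fuse` and `run` are hypothetical, made up for this sketch, and there is no optimizer table here, only the composition step.

```python
# Assumption: a "mapper" takes one element and returns a list of 0..n results.

def grep_mapper(cond):
    # Keep the element (as a one-item list) or drop it (empty list).
    return lambda x: [x] if cond(x) else []

def map_mapper(func):
    # Always produce exactly one transformed element.
    return lambda x: [func(x)]

def fuse(*mappers):
    """Collapse a left-to-right chain of mappers into a single mapper."""
    def fused(x):
        items = [x]
        for m in mappers:
            # Feed every surviving item through the next stage.
            items = [out for item in items for out in m(item)]
        return items
    return fused

def run(mapper, values):
    # One pass over the input; no intermediate list per pipeline stage.
    result = []
    for v in values:
        result.extend(mapper(v))
    return result

words = ["Perl", "LAZY", "list", "pipelines", "Map"]
pipeline = fuse(map_mapper(str.lower), grep_mapper(lambda w: len(w) == 4))
fused_result = run(pipeline, words)
print(fused_result)
```

Something like sort cannot be written as a per-element mapper, so it would not fit into `fuse` and would have to break the chain.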
Now, of course you still have things like sort where you cannot operate on single elements, but those cases are more difficult to correctly optimize, and lazy lists won't help those either.
For reference, let's use the following problem:
Take as input a list of numbers. For every five numbers, return the median element of that grouping. If the total list length is not divisible by five, discard the rest. (Not the most useful, but this is just example code).
Basic sub to do this:
    sub MediansBy5 (*@list) {
        my @result;
        while @list.length >= 5 {
            push @result, (sort @list.splice(0,5))[2];
        }
        return @result;
    }
I would suspect that this would not be evaluated in a lazy manner, especially because I'm pooling the results in @result.
One solution I see would be a "lazy return" of some kind: you send out the results you have so far without committing that execution is over, so further results can still be posted. For lack of a better word, I'll call this "lazy return" C<emit>.
Example above becomes:
    sub MediansBy5 (*@list) {
        while @list.length >= 5 {
            emit (sort @list.splice(0,5))[2];
        }
    }
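For comparison, the behaviour asked of C<emit> looks close to what a generator's `yield` does in Python. The sketch below is only an analogy, not a claim about how Perl 6 would implement it; `medians_by_5` is a hypothetical Python rendering of the sub above.

```python
from itertools import count, islice

def medians_by_5(numbers):
    """Yield the median of each full group of five; drop any leftover tail."""
    it = iter(numbers)
    while True:
        group = list(islice(it, 5))   # pull at most five items from upstream
        if len(group) < 5:
            return                    # incomplete final group is discarded
        yield sorted(group)[2]        # hand back one result, then pause

# Finite input: one full group, two leftovers discarded.
finite = list(medians_by_5([9, 1, 5, 3, 7, 2, 4]))

# Infinite input: only three groups (15 numbers) are ever consumed.
first_three = list(islice(medians_by_5(count(1)), 3))
```

Because each result is handed back as soon as its group is complete, the function works on an unbounded input, which the version that pools everything in @result could not.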
And then you could have the relatively simple:
    sub grep (&cond, *@list) {
        for @list -> $_ {
            emit $_ if cond($_);
        }
    }
    sub map (&func, *@list) {
        for @list -> $_ {
            emit func($_);
        }
    }
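To show how such emit-based definitions might chain lazily, here is a small Python sketch using generators as a stand-in for C<emit>. The names `lazy_grep`, `lazy_map`, `source` and `pulled` are hypothetical; `pulled` exists only to record how much of the input the pipeline actually touched.

```python
from itertools import count, islice

def lazy_grep(cond, items):
    for item in items:
        if cond(item):
            yield item          # plays the role of "emit $_ if cond($_)"

def lazy_map(func, items):
    for item in items:
        yield func(item)        # plays the role of "emit func($_)"

pulled = []                     # records every element the source hands out

def source():
    # An unbounded input list: 1, 2, 3, ...
    for n in count(1):
        pulled.append(n)
        yield n

# Square every number, keep the squares divisible by 3, take the first two.
pipeline = lazy_grep(lambda n: n % 3 == 0, lazy_map(lambda n: n * n, source()))
first_two = list(islice(pipeline, 2))
```

Only as many source elements are pulled as are needed to produce the two requested results, even though the source never ends.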
Thoughts?
-- Rod Adams