Lazy lists and optimizing for responsiveness

Yuval Kogman Mon, 19 Sep 2005 05:18:41 -0700

One thing that is extraordinarily hard to do with the facilities we
have today is finding the responsive optimum between laziness and
eagerness.


Let's use an example.

WWW::Mechanize comes with a nice example script for mailing list
moderation.

This script can be rather easily hacked to work on several mailing
lists at once.

Let's look at several approaches to how responsiveness of the
program could be improved:

1. lazy loading

my @messages = gather {
        for @lists -> $mailing_list {
                take $mailing_list.messages; # mech logs into the mailman
                # interface and collects the subjects. This list is possibly
                # lazy too
        }
}

for @messages -> $message {
        # prompt with $message.subject and what not
}


This will pause between each mailing list, but present the first
message as early as possible.

Responsiveness is pretty good overall, but occasionally the user
must wait.

2. strict loading

my @messages = ** gather { ... }

This will pause once in the begining, presenting the user with a all
the messages at once.

Responsiveness is high between messages (never any wait between two
messages) but there is a long startup time.

3. multithreaded approach

my @messages;
my $done;
async {
        for @lists -> $mailing_list {
                push @messages, $mailing_list.messages
        }
        $done = 1;
};

while (!$done) {
        while (@messages) { 
                my $message = shift @messages;
                ...
        }
        
        wait_for_write(@messages); # assume there is no race condition
        # between the depletion and the write
}

This solution lacks the elegance of the lazy loading approach, but
has the best responsiveness. These implementations tend to be overly
complex for what they do, and hence not worth the maintenance costs.

The gain is that the user only has to wait for the first message,
and if the throughput of the system is higher than the user's input,
there is no responsiveness loss because something is always ready
for the user to get at.

Ideally we could have the lazy list approach have some kind of way
to modify the lazyness so that it's something in between -
background eagerness - the lazy list is resolved in the background,
accumilating to @messages. When @messages is filling up quickly the
background resolution thread might get a lower priority. when
@messages is empty, the receiver side has to block.

This behaves much like UNIX pipes - when the buffer window is filled
the writer process blocks.

The problem with UNIX pipes is that there is no orthogonality -
there is one input and one output. But they do make sense for
character streams.

The lazy pulling model lets us model this with higher level data,
and more flexibility.

Note: I have an idea of how easy I want this to be, but not how I
want to do it.

I think that a nice solution is to allow an optional named adverb to
gather which defaults to lazy:

        gather : async {

        }

        gather : lazy {

        }

        gather : eager { # or strict

        }

These behaviors can then cascade nicely. For example to get the full
result set from many streams simultaneously you do something like

my @full = gather : eager { # doesn't return until the inner gathers are 
depleted
        take gather : async { ... }
        take gather : async { ... }
}

Another interesting mechanism is the push side of the deal. I think
a give synonym for 'take' is in order.

In the listmod script the prompt loop will asynchronously take from
the message list, and give to the change submission list.

This allows lazifying that is not only in callee context.

async {
        for @delete_list -> $message {
                $message.delete; # WWW::Mechanize again
        }
}

for @messages -> $message {
        # prompt
        @delete_list.give($message) if $result_of_prompt ~~ Delete;
}

This gets even more interesting when we want to optimize the
deletion code so that messages are deleted in clusters, to improve
throughput. Something like

for :async :noblock @delete_list -> [EMAIL PROTECTED] {
        # @messages is filled with as many messages as there were
        # available right now
        $mech.delete(@messages); # bah
}

We could wrap such that:

my @delete_list = gather {
        take $message if prompt() ~~ Delete;
}

but to improve readability the point of balance should be the prompt
loop, not the deletion loop. I back this point with the comparison
of two descriptions of the program:

        listmod is a program that prompts about each message in a number
        of mialing list interfaces, and deletes the ones you asked it
        to.

        listmod is a program that deletes all the messages that the user
        asked to delete from a mailing list interface.

I find that the second description requires more mental stack space
and IMHO the code should be written that way too.

Also, 'giving' from the prompt loop is also more scalable since you
can have more than the delete queue - messages are also approved.

With a pull only model you need to part the list and iterate the two
result sets asynchronously. With a push model you give your handler
results as they come in, and it gobbles them up in the background as
it sees fit (chunked, as fast as possible, with a delay loop...
whatever).

-- 
 ()  Yuval Kogman <[EMAIL PROTECTED]> 0xEBD27418  perl hacker &
 /\  kung foo master: /me spreads pj3Ar using 0wnage: neeyah!!!!!!!!!!!

pgpzoi3G0SfZH.pgp
Description: PGP signature

Lazy lists and optimizing for responsiveness

Reply via email to