Thomas Lotze wrote:
> A related problem is skipping whitespace. Sometimes you don't care about
> whitespace tokens, sometimes you do. Using generators, you can either set
> a state variable, say on the object the generator is an attribute of,
> before each call that requires a deviation from the default, or you can
> have a second generator for filtering the output of the first.

Last night's sleep was really productive: I've found another way to tackle this problem, and it's really simple IMO. One could pass the parameter at generator instantiation time and simply create two generators that behave differently. They work on the same data and share the same source code, only with a different parametrization. All one has to take care of is that they never get out of sync.

If the data pointer is an object attribute, it's clear how to do it. Otherwise, both could acquire their data from a common generator that yields the PDF content (or a buffer representing part of it) character by character. This is even faster than keeping a pointer and using it as an index on the data.

-- 
Thomas
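
For illustration, here is a minimal sketch of the object-attribute variant described above. The names `Scanner`, `tokens` and `skip_whitespace` are made up for this example, and the tokenizing rule (a token is a run of whitespace or a run of non-whitespace) is a simplifying assumption, not real PDF syntax. Both generators come from the same method and read the shared `pos` attribute afresh on every step, which is what keeps them in sync.

```python
class Scanner:
    """Hypothetical tokenizer; the data pointer lives on the object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # shared by every generator created from tokens()

    def tokens(self, skip_whitespace):
        # The parameter is fixed when the generator is instantiated.
        while self.pos < len(self.data):
            start = self.pos
            is_space = self.data[start].isspace()
            end = start + 1
            # Extend the token over a run of characters of the same kind.
            while end < len(self.data) and self.data[end].isspace() == is_space:
                end += 1
            # Advance the shared pointer *before* yielding, so the other
            # generator continues from the right place.
            self.pos = end
            if is_space and skip_whitespace:
                continue
            yield self.data[start:end]


scanner = Scanner("BT /F1 12 Tf")
with_ws = scanner.tokens(skip_whitespace=False)
without_ws = scanner.tokens(skip_whitespace=True)

first = next(without_ws)   # "BT"
second = next(with_ws)     # " "   (whitespace is reported here)
third = next(with_ws)      # "/F1"
fourth = next(without_ws)  # "12"  (the blank before it is skipped)
```

The caller picks whichever generator fits the current context and calls `next()` on it; no state variable needs to be toggled between calls.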