Re: Best practice for operations on streams of text

2009-05-17 Thread Beni Cherniavsky
On May 8, 12:07 am, MRAB wrote:

    def compound_filter(token_stream):
        stream = lowercase_token(token_stream)
        stream = remove_boring(stream)
        stream = remove_dupes(stream)
        for t in stream:
            yield t

The last loop is superfluous. You can just do::

    def compoun…
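
[Editor's note: Beni's suggestion is cut off by the archive preview. A minimal sketch of the idea it appears to describe, keeping the filter names from MRAB's quoted version and assuming each filter is a generator function that takes and returns an iterator of tokens:]

    def compound_filter(token_stream):
        # Each filter consumes one iterator and returns another, so the
        # composed result is already an iterator of tokens; no wrapper
        # loop is needed to re-yield its items.
        return remove_dupes(remove_boring(lowercase_token(token_stream)))

If the wrapper must itself remain a generator function, Python 3.3+ also allows replacing the final loop with ``yield from stream``.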

Re: Best practice for operations on streams of text

2009-05-07 Thread Terry Reedy
MRAB wrote: James wrote: Hello all, I'm working on some NLP code - what I'm doing is passing a large number of tokens through a number of filtering / processing steps. The filters take a token as input, and may or may not yield a token as a result. For example, I might have filters which lowercase…

Re: Best practice for operations on streams of text

2009-05-07 Thread MRAB
James wrote: Hello all, I'm working on some NLP code - what I'm doing is passing a large number of tokens through a number of filtering / processing steps. The filters take a token as input, and may or may not yield a token as a result. For example, I might have filters which lowercase the input…

Re: Best practice for operations on streams of text

2009-05-07 Thread Gary Herron
James wrote: Hello all, I'm working on some NLP code - what I'm doing is passing a large number of tokens through a number of filtering / processing steps. The filters take a token as input, and may or may not yield a token as a result. For example, I might have filters which lowercase the input…

Re: Best practice for operations on streams of text

2009-05-07 Thread J Kenneth King
James writes: Hello all, I'm working on some NLP code - what I'm doing is passing a large number of tokens through a number of filtering / processing steps. The filters take a token as input, and may or may not yield a token as a result. For example, I might have filters which lowercase…

Best practice for operations on streams of text

2009-05-07 Thread James
Hello all, I'm working on some NLP code - what I'm doing is passing a large number of tokens through a number of filtering / processing steps. The filters take a token as input, and may or may not yield a token as a result. For example, I might have filters which lowercase the input, filter out boring…
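
[Editor's note: a minimal sketch of the kind of filter pipeline James describes. The filter names echo those in MRAB's reply above; the example stop list and sample tokens are illustrative assumptions, not from the thread.]

    def lowercase_token(tokens):
        # Always yields: passes every token through, lowercased.
        for tok in tokens:
            yield tok.lower()

    def remove_boring(tokens, boring=frozenset({"the", "a", "an"})):
        # May or may not yield: drops tokens found in the stop list.
        for tok in tokens:
            if tok not in boring:
                yield tok

    def remove_dupes(tokens):
        # May or may not yield: drops tokens seen earlier in the stream.
        seen = set()
        for tok in tokens:
            if tok not in seen:
                seen.add(tok)
                yield tok

    # Chaining the generators keeps the whole pipeline lazy: each token
    # flows through every filter before the next token is read.
    tokens = iter(["The", "Cat", "sat", "the", "CAT"])
    print(list(remove_dupes(remove_boring(lowercase_token(tokens)))))
    # -> ['cat', 'sat']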