On Wed, Nov 9, 2016 at 6:38 AM, Steve D'Aprano <steve+pyt...@pearwood.info> wrote: > And here's an implementation for arbitrary n-grams: > > > def ngrams(iterable, n=2): > if n < 1: > raise ValueError > t = tee(iterable, n) > for i, x in enumerate(t): > for j in range(i): > next(x, None) > return zip(*t) > > > Can we do better, or is that optimal (for any definition of optimal that you > like)?
The tee object uses O(n) space, which I believe is unavoidable. Time-wise it's O(n^2 + nm) = O(nm) where m is the size of the iterable. O(nm) is also the total size of the output, so that's also unavoidable with the assumption that the caller is going to consume the entire output. If that assumption doesn't hold, then perhaps this could be improved by returning an iterator of iterators rather than of tuples (with the proviso that the contents of the sub-iterator become unavailable once the main iterator has been advanced; itertools.groupby works similarly). -- https://mail.python.org/mailman/listinfo/python-list