17.13 Operating on Iterators
Credit: Sami Hangaslammi
17.13.1 Problem
You need to operate on iterators (including normal sequences) with the same semantics as normal sequence operations, except that lazy evaluation is a must, because some of the iterators involved could represent unbounded sequences.
17.13.2 Solution
Python 2.2 iterators are easy to handle via higher-order functions, and lazy evaluation (such as that performed by the xrange built-in function) can be generalized. The elementary operations below include concatenating several iterators, terminating iteration when a function becomes false, terminating iteration after the first n values, and returning every nth result of an iterator:
    from __future__ import generators

    def itercat(*iterators):
        """ Concatenate several iterators into one. """
        for i in iterators:
            i = iter(i)
            for x in i:
                yield x

    def iterwhile(func, iterator):
        """ Iterate for as long as func(value) returns true. """
        iterator = iter(iterator)
        while 1:
            next = iterator.next()
            if not func(next):
                raise StopIteration    # or: return
            yield next

    def iterfirst(iterator, count=1):
        """ Iterate through 'count' first values. """
        iterator = iter(iterator)
        for i in xrange(count):
            yield iterator.next()

    def iterstep(iterator, n):
        """ Iterate every nth value. """
        iterator = iter(iterator)
        while 1:
            yield iterator.next()
            # Skip n-1 values
            for dummy in range(n-1):
                iterator.next()
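To see how these four generators behave on an unbounded source, here is a minimal sketch ported to Python 3 syntax. The port is an assumption, not part of the recipe: it uses next(it) in place of it.next(), range in place of xrange, and a plain return wherever the original lets StopIteration escape from a generator (Python 3.7 and later turn that into a RuntimeError). The naturals helper is hypothetical, introduced only for illustration.

```python
def naturals():
    """Hypothetical helper: the unbounded sequence 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1

def itercat(*iterators):
    """Concatenate several iterables into one."""
    for it in iterators:
        for x in it:
            yield x

def iterwhile(func, iterator):
    """Iterate for as long as func(value) returns true."""
    for value in iterator:
        if not func(value):
            return
        yield value

def iterfirst(iterator, count=1):
    """Iterate through the first 'count' values."""
    iterator = iter(iterator)
    for _ in range(count):
        try:
            yield next(iterator)
        except StopIteration:
            return

def iterstep(iterator, n):
    """Yield one value, skip the next n-1 values, and repeat."""
    iterator = iter(iterator)
    while True:
        try:
            yield next(iterator)
            for _ in range(n - 1):
                next(iterator)
        except StopIteration:
            return

# Every call below pulls only a finite prefix of the unbounded source.
first_five = list(iterfirst(naturals(), 5))
below_four = list(iterwhile(lambda x: x < 4, naturals()))
every_third = list(iterfirst(iterstep(naturals(), 3), 4))
joined = list(iterfirst(itercat("ab", naturals()), 4))
```

Each result is an ordinary list, yet no call ever tries to exhaust naturals(): the consumer decides how much of the sequence is computed.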
A bit less elementary, but still generally useful, are functions that transform an iterator's output, not just selecting which values to return and which to skip, but actually changing the structure. For example, here is a function that bunches up an iterator's results into a sequence of tuples, each of length count:
    from __future__ import generators

    def itergroup(iterator, count, keep_partial=1):
        """ Iterate in groups of 'count' values. If there aren't enough values for
        the last group, it's padded with None's, or discarded if keep_partial is
        passed as false. """
        iterator = iter(iterator)
        while 1:
            result = [None]*count
            for x in range(count):
                try:
                    result[x] = iterator.next()
                except StopIteration:
                    if x and keep_partial:
                        break
                    else:
                        raise
            yield tuple(result)
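The padding and discarding behavior is easiest to see on a short input. The following is a sketch of the same logic in Python 3 syntax (an assumed port, not the recipe's own code), where the bare raise becomes a return so that the generator ends cleanly:

```python
def itergroup(iterator, count, keep_partial=True):
    """Yield tuples of 'count' values; pad or drop a short final group."""
    iterator = iter(iterator)
    while True:
        result = [None] * count
        for i in range(count):
            try:
                result[i] = next(iterator)
            except StopIteration:
                if i and keep_partial:
                    break      # partial group: keep it, padded with None
                return         # nothing collected, or partial groups unwanted
        yield tuple(result)

padded = list(itergroup(range(7), 3))
dropped = list(itergroup(range(7), 3, keep_partial=False))
exact = list(itergroup("abcdef", 2))
```

With seven values and groups of three, the last value ends up in a group padded with None, or is dropped when keep_partial is false.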
Here are iterator-based versions of the built-in functions zip, map, filter, and reduce:

    from __future__ import generators

    def xzip(*iterators):
        """ Iterative (lazy) version of built-in 'zip' """
        iterators = map(iter, iterators)
        while 1:
            yield tuple([x.next() for x in iterators])

    def xmap(func, *iterators):
        """ Iterative (lazy) version of built-in 'map'. """
        iterators = map(iter, iterators)
        count = len(iterators)
        def values():
            # map pads shorter sequences with None when they run out of values
            result = [None]*count
            some_ok = 0
            for i in range(count):
                if iterators[i] is not None:
                    try:
                        result[i] = iterators[i].next()
                    except StopIteration:
                        iterators[i] = None
                    else:
                        some_ok = 1
            if some_ok:
                return tuple(result)
            else:
                raise StopIteration
        while 1:
            args = values()
            if func is None:
                yield args
            else:
                yield func(*args)

    def xfilter(func, iterator):
        """ Iterative version of built-in 'filter' """
        iterator = iter(iterator)
        while 1:
            next = iterator.next()
            if func(next):
                yield next

    def xreduce(func, iterator, default=None):
        """ Iterative version of built-in 'reduce' """
        iterator = iter(iterator)
        try:
            prev = iterator.next()
        except StopIteration:
            return default
        single = 1
        for next in iterator:
            single = 0
            prev = func(prev, next)
        if single:
            # Unlike the built-in reduce, a one-item sequence is combined
            # with default rather than returned as is
            return func(prev, default)
        return prev
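As an illustration of how xzip and xmap differ when their inputs have unequal lengths, here is a sketch in Python 3 syntax. It is an assumed port rather than the recipe's code: list comprehensions replace map(iter, ...), and StopIteration is caught and turned into a return. The naturals helper is hypothetical.

```python
def naturals():
    """Hypothetical helper: the unbounded sequence 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1

def xzip(*iterables):
    """Lazy zip: stops as soon as any input is exhausted."""
    iterators = [iter(it) for it in iterables]
    while True:
        row = []
        for it in iterators:
            try:
                row.append(next(it))
            except StopIteration:
                return
        yield tuple(row)

def xmap(func, *iterables):
    """Lazy map with Python 2 semantics: exhausted inputs yield None."""
    iterators = [iter(it) for it in iterables]
    while True:
        args = [None] * len(iterators)
        some_ok = False
        for i, it in enumerate(iterators):
            if it is None:
                continue           # this input ran out earlier
            try:
                args[i] = next(it)
                some_ok = True
            except StopIteration:
                iterators[i] = None
        if not some_ok:
            return                 # every input is exhausted
        yield tuple(args) if func is None else func(*args)

pairs = list(xzip("abc", naturals()))
padded = list(xmap(None, [1, 2, 3], "x"))
sums = list(xmap(lambda a, b: a + b, [1, 2], [10, 20]))
```

Because xzip stops at the shortest input, pairing a finite sequence with an unbounded one terminates. xmap keeps going until every input runs out, so it should only be given bounded inputs.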
17.13.3 Discussion
This recipe is a collection of small utility functions for iterators (all functions can also be used with normal sequences). Among other things, the module presented in this recipe provides generator (lazy) versions of the built-in sequence-manipulation functions. The generators can be combined to produce a more specialized iterator. This recipe requires Python 2.2 or later, of course.
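Combining the generators might look like the following sketch, which builds the odd squares below 100 from an unbounded source. It is written in Python 3 syntax and makes some assumptions: xmap is simplified here to a single input, and naturals is a hypothetical helper.

```python
def naturals():
    """Hypothetical helper: the unbounded sequence 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1

def xfilter(func, iterator):
    """Lazily keep only the values for which func(value) is true."""
    for value in iterator:
        if func(value):
            yield value

def xmap(func, iterator):
    """Simplified single-input lazy map (the recipe's version takes several)."""
    for value in iterator:
        yield func(value)

def iterwhile(func, iterator):
    """Iterate for as long as func(value) returns true."""
    for value in iterator:
        if not func(value):
            return
        yield value

# Each stage wraps the previous one; nothing runs until list() pulls values.
odds = xfilter(lambda n: n % 2, naturals())
squares = xmap(lambda n: n * n, odds)
odd_squares = list(iterwhile(lambda s: s < 100, squares))
```

Only the final iterwhile stage bounds the pipeline; the two stages before it remain unbounded and are safe precisely because they are lazy.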
The built-in sequence-manipulation functions zip, map, and filter are specified to return sequences, and those specifications cannot be changed without breaking backward compatibility with versions of Python before 2.2, which lacked iterators. Therefore, the built-ins cannot become lazy. However, it's easy to write lazy iterator-based versions of these useful functions, as well as other iterator-manipulation functions, as exemplified in this recipe.
Of course, lazy evaluation is not terribly useful in certain cases. The semantics of reduce, for example, require that the whole sequence be evaluated anyway. While in some cases one could save some memory by looping through the sequence that the iterator yields, rather than expanding it, most often it will be more practical to use reduce(func, iterator) instead of the xreduce function presented in this recipe.
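The point about reduce can be checked with a short sketch. It assumes Python 3, where the built-in lives in functools, and uses a hypothetical traced helper to record which values get pulled from the iterator:

```python
from functools import reduce
import operator

consumed = []

def traced(iterable):
    """Hypothetical helper: record each value as it is pulled."""
    for value in iterable:
        consumed.append(value)
        yield value

# reduce accepts any iterator directly, but has to consume all of it.
total = reduce(operator.add, traced(range(1, 6)))
```

After the call, consumed holds every value of the input: there is no prefix that reduce could stop at, so a lazy wrapper has nothing to save beyond the memory of an intermediate list.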
Lazy evaluation is most useful when the consumer may need only a reasonably short prefix of the iterator-represented sequence, as with the zip function and the iterwhile and iterfirst functions in this recipe. In such cases, lazy evaluation enables free use of unbounded sequences and of sequences of potentially humongous length. Of course, the resulting program terminates only if each unbounded sequence is used solely in contexts that take a finite prefix of it.
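For readers on a current Python, the same finite-prefix pattern can be sketched with the standard library's itertools module instead of the recipe's functions. This swaps in a different tool: chain, takewhile, and islice play roughly the roles of itercat, iterwhile, and iterfirst/iterstep, and count supplies the unbounded sequence.

```python
from itertools import chain, count, islice, takewhile

# count() is unbounded; each consumer below takes only a finite prefix.
prefix = list(islice(count(), 5))
below_four = list(takewhile(lambda x: x < 4, count()))
every_third = list(islice(count(), 0, 12, 3))
joined = list(islice(chain("ab", count()), 4))
zipped = list(zip("abc", count()))
```

Dropping any of the bounding calls (for example, list(count())) would never terminate, which is exactly the caveat stated above.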
17.13.4 See Also