Why is the Python itertools "consume" recipe faster than calling next n times?
In the Python documentation for itertools, the following "recipe" is provided for advancing an iterator n steps:

    import collections
    from itertools import islice

    def consume(iterator, n):
        "Advance the iterator n-steps ahead. If n is None, consume entirely."
        # Use functions that consume iterators at C speed.
        if n is None:
            # Feed the entire iterator into a zero-length deque.
            collections.deque(iterator, maxlen=0)
        else:
            # Advance to the empty slice starting at position n.
            next(islice(iterator, n, n), None)

I'm wondering why this recipe is fundamentally different from something like the following (aside from the handling of consuming the whole iterator):

    def other_consume(iterable, n):
        for i in xrange(n):
            next(iterable, None)

I used timeit to confirm that, as expected, the above approach is slower. What's going on in the recipe that allows for this superior performance? I get that it uses islice, but looking at islice, it appears to be doing fundamentally the same thing as the code above:

    def islice(iterable, *args):
        s = slice(*args)
        it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
        nexti = next(it)
        ### It seems as if this loop yields from the iterable n times via enumerate.
        ### How is this different from calling next n times?
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                nexti = next(it)

Note: even if, instead of importing islice from itertools, I define it using the Python equivalent from the docs shown above, the recipe is still faster.
Edit: the timeit code is here:

    timeit.timeit('a = iter([random() for i in xrange(1000000)]); consume(a, 1000000)', setup="from __main__ import consume,random", number=10)
    timeit.timeit('a = iter([random() for i in xrange(1000000)]); other_consume(a, 1000000)', setup="from __main__ import other_consume,random", number=10)

other_consume is ~2.5x slower each time I run this.
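For readers on Python 3, the comparison above can be sketched roughly as follows. This is only an illustration under stated assumptions: `xrange` becomes `range`, the iterator size `N` and the helper `run` are hypothetical choices that do not appear in the original post, and the absolute timings will differ from the ~2.5x figure quoted above.

```python
import collections
import timeit
from itertools import islice


def consume(iterator, n=None):
    """Advance the iterator n steps ahead. If n is None, consume entirely."""
    if n is None:
        # Feed the entire iterator into a zero-length deque.
        collections.deque(iterator, maxlen=0)
    else:
        # Advance to the empty slice starting at position n.
        next(islice(iterator, n, n), None)


def other_consume(iterator, n):
    """Advance the iterator n steps with a Python-level loop."""
    for _ in range(n):
        next(iterator, None)


N = 100000  # hypothetical size, smaller than the 1000000 used in the question


def run(func):
    # Build a fresh iterator on every call so each run does the same work.
    func(iter(range(N)), N)


t_recipe = timeit.timeit(lambda: run(consume), number=5)
t_loop = timeit.timeit(lambda: run(other_consume), number=5)
print("recipe: %.4fs, loop: %.4fs" % (t_recipe, t_loop))
```

Both functions leave the iterator in the same position; only the time taken to get there differs.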
The documentation on itertools.islice() is flawed: its Python equivalent doesn't handle the edge case of start == stop properly. That is exactly the edge case consume() uses.
For islice(it, n, n), exactly n elements are consumed from it, but nothing is ever yielded. Instead, StopIteration is raised after those n elements have been consumed.
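A quick way to observe this behaviour of the real islice, written here for Python 3 (so `range` instead of `xrange`); the names `sentinel`, `yielded_nothing`, and `next_value` are illustrative only:

```python
from itertools import islice

it = iter(range(10))
sentinel = object()

# islice(it, 3, 3) is an empty slice, so next() falls back to the sentinel,
# but the first three elements of `it` are consumed along the way.
result = next(islice(it, 3, 3), sentinel)
yielded_nothing = result is sentinel

# The underlying iterator has been advanced past 0, 1 and 2.
next_value = next(it)
print(yielded_nothing, next_value)
```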
The Python version you used to test, on the other hand, raises StopIteration immediately without ever consuming anything from it. This makes any timings against this pure-Python version incorrect, and way too fast.
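The difference can be checked with a Python 3 port of the docs' pure-Python equivalent quoted in the question. This is a sketch, not the documented code itself: the name `islice_py2_docs` is hypothetical, `xrange` and `sys.maxint` are replaced by `range` and `sys.maxsize`, and the `try`/`except` blocks are added because Python 3.7+ turns a StopIteration that escapes a generator body into a RuntimeError.

```python
import sys


def islice_py2_docs(iterable, *args):
    # Port of the old documentation's "equivalent" code (an assumption-laden sketch).
    s = slice(*args)
    it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
    try:
        nexti = next(it)  # for start == stop this range is empty
    except StopIteration:
        return  # ends here, before touching `iterable` at all
    for i, element in enumerate(iterable):
        if i == nexti:
            yield element
            try:
                nexti = next(it)
            except StopIteration:
                return


source = iter(range(10))
next(islice_py2_docs(source, 3, 3), None)

# Nothing was consumed, so the source still starts at 0.
first_remaining = next(source)
print(first_remaining)
```

With the real islice the same call would leave `source` positioned at 3, which is why timing this pure-Python version says nothing about the cost of actually advancing the iterator.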
This is because the xrange(n, n, 1) iterator immediately raises StopIteration:

    >>> it = iter(xrange(1, 1))
    >>> print next(it)
    Traceback (most recent call last):
      File "prog.py", line 4, in <module>
        print next(it)
    StopIteration