Why is Python's itertools "consume" recipe faster than calling next n times?

April 15, 2013

In the Python documentation, itertools provides the following "recipe" for advancing an iterator n steps:

    import collections
    from itertools import islice

    def consume(iterator, n):
        "Advance the iterator n-steps ahead. If n is None, consume entirely."
        # Use functions that consume iterators at C speed.
        if n is None:
            # Feed the entire iterator into a zero-length deque.
            collections.deque(iterator, maxlen=0)
        else:
            # Advance to the empty slice starting at position n.
            next(islice(iterator, n, n), None)

I'm wondering why this recipe is fundamentally different from something like the following (aside from the handling of consuming the whole iterator):

    def other_consume(iterable, n):
        # Advance the iterator one item at a time from Python code.
        for i in xrange(n):
            next(iterable, None)

I used timeit to confirm that, as expected, the above approach is slower. What is going on in the recipe that allows this superior performance? I get that it uses islice, but looking at islice, it appears to be doing fundamentally the same thing as the code above:

    import sys

    def islice(iterable, *args):
        s = slice(*args)
        it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
        nexti = next(it)
        ### It seems as if this loop pulls from the iterable n times via enumerate.
        ### How is this different from calling next n times?
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                nexti = next(it)

Note: even if, instead of importing islice from itertools, I define it using the Python equivalent from the docs shown above, the recipe is still faster.

Edit: the timeit code is here:

    timeit.timeit('a = iter([random() for i in xrange(1000000)]); consume(a, 1000000)',
                  setup="from __main__ import consume,random", number=10)
    timeit.timeit('a = iter([random() for i in xrange(1000000)]); other_consume(a, 1000000)',
                  setup="from __main__ import other_consume,random", number=10)

other_consume is ~2.5x slower each time I run this.
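For anyone wanting to repeat the measurement on Python 3, here is a rough port of the timing setup. It is only a sketch: range stands in for xrange, the helper name time_it and the smaller default sizes are my own choices, and the numbers will vary by machine and interpreter version.

```python
import collections
import timeit
from itertools import islice
from random import random


def consume(iterator, n):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    if n is None:
        collections.deque(iterator, maxlen=0)
    else:
        next(islice(iterator, n, n), None)


def other_consume(iterator, n):
    # Advance one item at a time from Python code.
    for _ in range(n):
        next(iterator, None)


def time_it(func, size=100_000, repeats=5):
    # Hypothetical helper: like the original statement, each run builds a
    # fresh list of random floats and then advances past all of them.
    def run():
        func(iter([random() for _ in range(size)]), size)
    return timeit.timeit(run, number=repeats)


if __name__ == "__main__":
    print("consume:      ", time_it(consume))
    print("other_consume:", time_it(other_consume))
```

As in the original timeit statement, building the list is part of what gets timed, so the measured ratio understates the gap between the two advancing strategies themselves.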

The documentation on itertools.islice() is flawed and doesn't handle the edge case of start == stop properly. That is exactly the edge case that consume() uses.

For islice(it, n, n), exactly n elements are consumed from it, but nothing is ever yielded. Instead, StopIteration is raised after those n elements have been consumed.
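A small check of that behaviour, assuming CPython's built-in itertools.islice on Python 3 (so range instead of xrange). The names source, empty_slice and sentinel are only for illustration:

```python
from itertools import islice

source = iter(range(10))

# islice(source, 3, 3) is an empty slice that starts at position 3.
empty_slice = islice(source, 3, 3)

# Asking for its first item skips three items of source, then finds nothing,
# so next() falls back to the sentinel instead of raising StopIteration.
sentinel = object()
result = next(empty_slice, sentinel)

yielded_nothing = result is sentinel
remaining = list(source)

print(yielded_nothing)  # True
print(remaining)        # [3, 4, 5, 6, 7, 8, 9]
```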

The Python version you used to test with, on the other hand, raises StopIteration immediately without ever consuming anything from it. This makes any timings against that pure-Python version incorrect, and way too fast.

This is because the xrange(n, n, 1) iterator immediately raises StopIteration:

    >>> it = iter(xrange(1, 1))
    >>> print next(it)
    Traceback (most recent call last):
      File "prog.py", line 4, in <module>
        print next(it)
    StopIteration
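To see the consequence for the pure-Python equivalent quoted in the question, here is a sketch of it ported to Python 3. The port is my own adaptation, not the documented code: range and sys.maxsize replace xrange and sys.maxint, the function is renamed islice_old_docs, and the explicit try/except blocks stand in for Python 2's behaviour of letting a StopIteration end a generator silently (on Python 3.7+ it would surface as a RuntimeError instead).

```python
import sys


def islice_old_docs(iterable, *args):
    # Python 3 port of the pure-Python equivalent quoted in the question.
    s = slice(*args)
    it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
    try:
        nexti = next(it)
    except StopIteration:
        # For start == stop the range is empty, so the generator ends here,
        # before a single item has been pulled from iterable.
        return
    for i, element in enumerate(iterable):
        if i == nexti:
            yield element
            try:
                nexti = next(it)
            except StopIteration:
                return


source = iter(range(10))
next(islice_old_docs(source, 3, 3), None)
untouched = list(source)

print(untouched)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Nothing was skipped, so a consume() built on this version returns almost immediately without advancing the iterator at all, which is why timing it says nothing about how fast it advances.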
