Why is Python's itertools "consume" recipe faster than calling next n times?


The Python documentation for itertools provides the following "recipe" for advancing an iterator n steps:

    import collections
    from itertools import islice

    def consume(iterator, n):
        "Advance the iterator n-steps ahead. If n is None, consume entirely."
        # Use functions that consume iterators at C speed.
        if n is None:
            # feed the entire iterator into a zero-length deque
            collections.deque(iterator, maxlen=0)
        else:
            # advance to the empty slice starting at position n
            next(islice(iterator, n, n), None)

I'm wondering why this recipe is fundamentally different from something like the following (aside from the handling of consuming the whole iterator):

    def other_consume(iterable, n):
        # call next() once per step, ignoring exhaustion
        for i in xrange(n):
            next(iterable, None)

I used timeit to confirm that, as expected, the above approach is slower. What's going on in the recipe that allows for this superior performance? It uses islice, but looking at islice, it appears to be doing fundamentally the same thing as the code above:

    import sys

    def islice(iterable, *args):
        s = slice(*args)
        it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
        nexti = next(it)
        ### it seems as if this loop pulls from the iterable n times via enumerate
        ### how is this different from calling next n times?
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                nexti = next(it)

Note: even if, instead of importing islice from itertools, I define it using the Python equivalent from the docs shown above, the recipe is still faster.

Edit: the timeit code is here:

    import timeit
    from random import random

    timeit.timeit('a = iter([random() for i in xrange(1000000)]); consume(a, 1000000)', setup="from __main__ import consume,random", number=10)
    timeit.timeit('a = iter([random() for i in xrange(1000000)]); other_consume(a, 1000000)', setup="from __main__ import other_consume,random", number=10)

other_consume is ~2.5x slower each time I run this.
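The timed statements above also build a one-million-element list on every run, so list construction is included in both measurements. A minimal sketch of a comparison that builds the data once is shown here; it is written for Python 3 (range instead of xrange), and the names N, data, t_recipe and t_loop are only illustrative. Actual timings will vary by machine and interpreter.

```python
import collections
import timeit
from itertools import islice

def consume(iterator, n=None):
    # itertools recipe: the skipping happens inside C-implemented helpers
    if n is None:
        collections.deque(iterator, maxlen=0)
    else:
        next(islice(iterator, n, n), None)

def other_consume(iterator, n):
    # one Python-level next() call per element
    for _ in range(n):
        next(iterator, None)

N = 100000
data = list(range(N))  # built once, so list creation is not part of the timing

# Each run gets a fresh iterator over the same list.
t_recipe = timeit.timeit(lambda: consume(iter(data), N), number=20)
t_loop = timeit.timeit(lambda: other_consume(iter(data), N), number=20)
print("recipe: %.4fs  loop: %.4fs" % (t_recipe, t_loop))
```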

The pure-Python equivalent given in the documentation for itertools.islice() is flawed and doesn't handle the edge case of start == stop properly. It is exactly that edge case that consume() uses.

For islice(it, n, n), exactly n elements are consumed from it, but nothing is ever yielded. Instead, StopIteration is raised after those n elements have been consumed.
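One way to see this behaviour is to take an empty slice of a small iterator and look at what is left over. This is a minimal sketch written for Python 3 rather than the Python 2 used in the question; the variable names are only illustrative.

```python
from itertools import islice

it = iter(range(10))

# An empty slice that starts and stops at position 3 yields nothing,
# so next() falls back to the default value None.
result = next(islice(it, 3, 3), None)

# The underlying iterator has still been advanced past 0, 1 and 2.
remaining = list(it)
print(result, remaining)  # None [3, 4, 5, 6, 7, 8, 9]
```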

The Python version you used to test with, on the other hand, raises StopIteration immediately without ever consuming anything from it. This makes any timings against that pure-Python version incorrect, and way too fast.

This is because the xrange(n, n, 1) iterator immediately raises StopIteration:

    >>> it = iter(xrange(1, 1))
    >>> print next(it)
    Traceback (most recent call last):
      File "prog.py", line 4, in <module>
        print next(it)
    StopIteration
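The effect on the pure-Python equivalent can be checked directly. The sketch below is a Python 3 transcription of the documented equivalent quoted in the question, under the hypothetical name docs_islice. It assumes range and sys.maxsize as stand-ins for xrange and sys.maxint, and it adds explicit try/except blocks because Python 3.7+ no longer lets a StopIteration raised inside a generator end it silently, as Python 2 did.

```python
import sys

def docs_islice(iterable, *args):
    # Transcription of the documented pure-Python equivalent (hypothetical name).
    s = slice(*args)
    it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
    try:
        nexti = next(it)  # an empty range (start == stop) stops right here
    except StopIteration:
        return            # Python 2 ended the generator implicitly at this point
    for i, element in enumerate(iterable):
        if i == nexti:
            yield element
            try:
                nexti = next(it)
            except StopIteration:
                return

source = iter(range(10))
result = next(docs_islice(source, 3, 3), None)

# Nothing was pulled from the source, unlike with the real itertools.islice.
remaining = list(source)
print(result, remaining)  # None [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```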
