Python Study 3 - Generators and Others

range vs np.arange vs xrange

In Python 2, range creates a list, so range(1, 10000000) builds a list of 9,999,999 elements in memory.
np.arange works much like range, but it returns a NumPy array instead of a list.
xrange is a sequence object that evaluates lazily, so it uses much less memory.
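The memory difference can be checked directly. The sketch below assumes Python 3, where range is lazy like Python 2's xrange and list(range(n)) stands in for Python 2's list-building range. The variable names are only illustrative.

```python
import sys

n = 1000000

lazy = range(n)          # Python 3: lazy sequence, comparable to Python 2's xrange
eager = list(range(n))   # every element is materialized, like Python 2's range

# getsizeof reports the container itself, not the integers it refers to,
# but that is already enough to show the gap.
lazy_size = sys.getsizeof(lazy)
eager_size = sys.getsizeof(eager)

print("lazy object: %d bytes" % lazy_size)
print("full list:   %d bytes" % eager_size)
```

The lazy object stays the same small size however large n gets, while the list grows with n.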

Efficiency

%timeit for i in range(1000000): pass
[out] 10 loops, best of 3: 63.6 ms per loop

%timeit for i in np.arange(1000000): pass
[out] 10 loops, best of 3: 158 ms per loop

%timeit for i in xrange(1000000): pass
[out] 10 loops, best of 3: 23.4 ms per loop

The Iteration Protocol

The built-in function iter takes an iterable object and returns an iterator.

>>> x = iter([1, 2, 3])
>>> x
<listiterator object at 0x1004ca850>
>>> x.next()
1
>>> x.next()
2
>>> x.next()
3
>>> x.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Each time the next method is called on the iterator, it gives the next element. If there are no more elements, it raises a StopIteration.
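A for loop drives this same protocol behind the scenes. As a rough sketch (manual_for is a made-up helper name, not a built-in), the loop can be written out by hand with iter and the built-in next function, which works in Python 2.6+ and in Python 3:

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)          # ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)      # same as iterator.next() in Python 2
        except StopIteration:          # raised when no elements are left
            break
        collected.append(item)
    return collected


print(manual_for([1, 2, 3]))
```

The StopIteration exception never reaches the caller; it is simply the signal that ends the loop.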

Generators

Generators simplify the creation of iterators. A generator is a function that produces a sequence of results instead of a single value.

def yrange(n):
    i = 0
    while i < n:
        yield i
        i += 1

Each time the yield statement is executed the function generates a new value.

>>> y = yrange(3)
>>> y
<generator object yrange at 0x401f30>
>>> y.next()
0
>>> y.next()
1
>>> y.next()
2
>>> y.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

So a generator is also an iterator, and you do not have to write the iterator protocol by hand.
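For comparison, here is roughly what a hand-written iterator with the same behavior as yrange might look like. The class name YRange is an assumption made for illustration. It defines __next__ for Python 3 and aliases it to next for Python 2.

```python
class YRange(object):
    """Hand-written iterator that counts from 0 up to n - 1."""

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.i < self.n:
            value = self.i
            self.i += 1
            return value
        raise StopIteration()

    next = __next__  # Python 2 spelling of the same method


print(list(YRange(3)))
```

The generator version needs five lines and no explicit state handling, which is the simplification the text refers to.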

When a generator function is called, it returns a generator object without starting to execute the function body. When the next method is called for the first time, the function runs until it reaches a yield statement, and the yielded value is returned by that next call. Each later call resumes execution right after the yield.
Example:

>>> def foo():
...     print "begin"
...     for i in range(3):
...         print "before yield", i
...         yield i
...         print "after yield", i
...     print "end"
...
>>> f = foo()
>>> f.next()
begin
before yield 0
0
>>> f.next()
after yield 0
before yield 1
1
>>> f.next()
after yield 1
before yield 2
2
>>> f.next()
after yield 2
end
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>
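Because the body only runs between calls to next, a generator can even describe a sequence that never ends. A minimal sketch, where naturals is a hypothetical name and itertools.islice is used to take a finite slice:

```python
from itertools import islice


def naturals():
    """Yield 0, 1, 2, ... without ever finishing."""
    n = 0
    while True:
        yield n      # execution pauses here until the next value is requested
        n += 1


# Only five values are ever computed; the loop is suspended after that.
first_five = list(islice(naturals(), 5))
print(first_five)
```

Calling list(naturals()) directly would never return, so an unbounded generator should always be consumed through something that stops early.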

Generator Expressions
Generator expressions are the generator version of list comprehensions. They look like list comprehensions, but they return a generator instead of a list.

>>> a = (x*x for x in range(10))
>>> a
<generator object <genexpr> at 0x401f08>
>>> sum(a)
285
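One practical difference from a list is that a generator can be consumed only once. The following sketch (variable names are illustrative) puts the two forms side by side:

```python
squares_list = [x * x for x in range(10)]   # list comprehension: all values stored
squares_gen = (x * x for x in range(10))    # generator expression: values made on demand

list_total = sum(squares_list)
first_total = sum(squares_gen)    # consumes the generator
second_total = sum(squares_gen)   # nothing left, so the sum of no items is 0

print(list_total, first_total, second_total)
```

If the values are needed more than once, build a list or recreate the generator expression.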

Relationship between Generator, Iterator and Iterable

[Figure: relationship between generators, iterators, and iterables]
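In short, every generator is an iterator, and every iterator is an iterable, but not the other way around. A small check, assuming Python 3.3 or later where the abstract base classes live in collections.abc (countdown is a made-up example generator):

```python
from collections.abc import Iterable, Iterator
from types import GeneratorType


def countdown(n):
    """Yield n, n - 1, ..., 1."""
    while n > 0:
        yield n
        n -= 1


numbers = [3, 2, 1]            # iterable, but not an iterator
list_iterator = iter(numbers)  # iterator, but not a generator
generator = countdown(3)       # generator, hence also iterator and iterable

for name, obj in [("list", numbers),
                  ("list iterator", list_iterator),
                  ("generator", generator)]:
    print(name,
          isinstance(obj, Iterable),
          isinstance(obj, Iterator),
          isinstance(obj, GeneratorType))
```

The list passes only the first check, the list iterator the first two, and the generator all three.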