# Plan for Today

• Talk about comprehensions and how to effectively use them.
• Generators
• Itertools
• File management (see other notebook)

# Comprehensions

Comprehensions provide a readable and efficient way of applying an expression to each item in an iterable.

The general form of a list comprehension is [expr(item) for item in iterable].

See here for more details.

## List Comprehensions

Inside square brackets [], we write an expression followed by a for clause; the result is a new list.

In [18]:
words = "This is a such a long course".split()
words

Out[18]:
['This', 'is', 'a', 'such', 'a', 'long', 'course']
In [19]:
[len(word) for word in words]

Out[19]:
[4, 2, 1, 4, 1, 4, 6]

A list comprehension transforms one list (or any other iterable) into a new list. It is a more readable alternative to the long-standing filter() and map() functions in Python.
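
To make the comparison with map() and filter() concrete, here is a small sketch that reuses the `words` list from above. The length threshold of 2 is an arbitrary choice for illustration, and the trailing `if` it relies on is covered later in this notebook.

```python
words = "This is a such a long course".split()

# map() applies a function to every item; the comprehension does the same inline.
lengths_map = list(map(len, words))
lengths_comp = [len(word) for word in words]

# filter() keeps the items that pass a test; the comprehension uses a trailing `if`.
long_filter = list(filter(lambda word: len(word) > 2, words))
long_comp = [word for word in words if len(word) > 2]

print(lengths_map == lengths_comp)
print(long_filter == long_comp)
```

Both pairs produce identical lists; the comprehension simply avoids the lambda and the extra list() call.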

## Set Comprehensions

(Introduced in Python 3.0 and backported to Python 2.7)

Inside curly braces {}, we write an expression followed by a for clause; the result is a new set.

Recall that the difference between a set and a dictionary is whether there is a key:value pair within the curly braces. When there is a key:value pair within the braces, it's a dictionary. When there are only values, it's a set. Note that empty braces {} create a dictionary, not a set. Use type() if you are unsure.
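
As a quick check of the rule above, the following sketch calls type() on a few literals. The variable names are illustrative only.

```python
empty_braces = {}                   # no values at all: a dictionary
values_only = {1, 2, 3}             # only values: a set
key_value_pairs = {"a": 1, "b": 2}  # key: value pairs: a dictionary
empty_set = set()                   # the way to spell an empty set

print(type(empty_braces))
print(type(values_only))
print(type(key_value_pairs))
print(type(empty_set))
```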

In [20]:
{len(word) for word in words}

Out[20]:
{1, 2, 4, 6}

## Dictionary Comprehensions

(Introduced in Python 3.0 and backported to Python 2.7)

Inside curly braces {}, we write a key: value pair followed by a for clause; the result is a new dictionary.

In [21]:
# Create two lists: one full of values and another of equal length full of keys.
list_of_values = [1,2,3,4,5]
list_of_keys = ['a','b','c','d','e']
length_of_the_lists = len(list_of_values)

{list_of_keys[i]:list_of_values[i] for i in range(length_of_the_lists)}

Out[21]:
{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
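
The index-based version above works, but the same dictionary can be built without tracking positions. The sketch below pairs the two lists with the built-in zip(), which is covered later in this notebook; it is one alternative, not the only idiomatic form.

```python
list_of_values = [1, 2, 3, 4, 5]
list_of_keys = ['a', 'b', 'c', 'd', 'e']

# zip() walks both lists in step, so no index or length is needed.
paired = {key: value for key, value in zip(list_of_keys, list_of_values)}

# When no transformation is applied, dict() accepts the pairs directly.
same_result = dict(zip(list_of_keys, list_of_values))

print(paired)
print(paired == same_result)
```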

## if statements in comprehensions

In [22]:
# Quickly produce a series of numbers
[i for i in range(10)]

Out[22]:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [23]:
[i for i in range(10) if i > 5 ]

Out[23]:
[6, 7, 8, 9]

An else clause isn't valid in the trailing if of a comprehension: that if can only keep or drop items, so the filter needs to be kept simple.

In [24]:
[i  for i in range(10) if i > 5 else "hello"]

  File "<ipython-input-24-27f752950cb6>", line 1
[i  for i in range(10) if i > 5 else "hello"]
^
SyntaxError: invalid syntax


## Conditional Expressions

A conditional expression is a concise, single-line if/else that evaluates to a value:

<this_thing> if <this_is_true> else <this_other_thing>

In [41]:
x = 4
"Yes" if x > 5 else "No"

Out[41]:
'No'
In [42]:
x = 6
"Yes" if x > 5 else "No"

Out[42]:
'Yes'
In [43]:
["Yes" if x > 5 else "No" for x in range(10)]

Out[43]:
['No', 'No', 'No', 'No', 'No', 'No', 'Yes', 'Yes', 'Yes', 'Yes']

## Nested comprehensions

In [25]:
[i for i in range(5)]

Out[25]:
[0, 1, 2, 3, 4]
In [26]:
[j for j in range(-5,0)]

Out[26]:
[-5, -4, -3, -2, -1]
In [27]:
[[i,j] for i in range(5) for j in range(-5,0)]

Out[27]:
[[0, -5],
[0, -4],
[0, -3],
[0, -2],
[0, -1],
[1, -5],
[1, -4],
[1, -3],
[1, -2],
[1, -1],
[2, -5],
[2, -4],
[2, -3],
[2, -2],
[2, -1],
[3, -5],
[3, -4],
[3, -3],
[3, -2],
[3, -1],
[4, -5],
[4, -4],
[4, -3],
[4, -2],
[4, -1]]
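
The for clauses in the comprehension above read left to right, in the same order as nested for loops. A minimal sketch of the equivalent loop form:

```python
# Explicit nested loops: the outer loop corresponds to the first `for` clause.
pairs_loop = []
for i in range(5):
    for j in range(-5, 0):
        pairs_loop.append([i, j])

# The same thing as a single comprehension.
pairs_comp = [[i, j] for i in range(5) for j in range(-5, 0)]

print(pairs_loop == pairs_comp)
print(len(pairs_comp))
```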

## Speed Boost

Comprehensions not only make our code more concise, they also tend to run faster than the equivalent explicit loop.

In [28]:
%%timeit
container = []
for i in range(1000):
    container.append(i)

89.3 µs ± 1.6 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [29]:
%%timeit
container = [i for i in range(1000)]

38.9 µs ± 382 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In this run, the comprehension takes a little under half the time of the explicit loop.
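
The %%timeit magic only works inside IPython or Jupyter. As a rough sketch, the same comparison can be run in a plain script with the standard-library timeit module. The two function names are made up for this example, and the absolute numbers will differ from machine to machine.

```python
import timeit


def build_with_loop():
    # Explicit loop with repeated append() calls.
    container = []
    for i in range(1000):
        container.append(i)
    return container


def build_with_comprehension():
    # Same result built with a list comprehension.
    return [i for i in range(1000)]


# Total time in seconds for 200 calls of each function.
loop_seconds = timeit.timeit(build_with_loop, number=200)
comp_seconds = timeit.timeit(build_with_comprehension, number=200)

print(f"loop: {loop_seconds:.4f}s  comprehension: {comp_seconds:.4f}s")
```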

## Internal Scope

In [30]:
# Say we create a string object letter
letter = 'z'

# Then we use letter as a placeholder
letters = ['a','b','c','d']
for letter in letters:
    print(letter)

a
b
c
d

In [31]:
letter  # letter got overwritten by the loop!

Out[31]:
'd'

Now let's do the same thing with a comprehension.

In [32]:
letter = 'z'
[letter for letter in letters]

Out[32]:
['a', 'b', 'c', 'd']
In [33]:
letter

Out[33]:
'z'

This means list comprehensions are more predictable: in Python 3 the loop variable lives in the comprehension's own scope, so reusing a placeholder name cannot accidentally overwrite a variable of the same name outside it, as a for loop does.
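
A related consequence, sketched below under the assumption of Python 3: a placeholder that exists only inside a comprehension is not defined afterwards. The name `placeholder_char` is chosen just for this illustration.

```python
letters = ['a', 'b', 'c', 'd']
upper = [placeholder_char.upper() for placeholder_char in letters]

# The placeholder was never created in the surrounding scope,
# so looking it up afterwards raises NameError.
try:
    placeholder_char
except NameError:
    leaked = False
else:
    leaked = True

print(upper)
print(leaked)
```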

# Generators

• Specify an iterable sequence that is evaluated lazily (computed on demand).
• All generators are iterators.
• Can model infinite sequences (such as data streams with no definite end).

Generators are written like functions; however, rather than using the return keyword, they use the yield keyword. If yield appears anywhere in a function body, that function is a generator function, and calling it returns a generator.

In [1]:
def gen123():
    yield 1
    yield 2
    yield 3

In [2]:
gen123

Out[2]:
<function __main__.gen123()>
In [9]:
g = gen123() # calling the function creates a generator object
g

Out[9]:
<generator object gen123 at 0x1038e6e58>

A generator behaves just like any other iterator; however, each call to next() does not fetch a stored item but runs the function body up to the next yield.

In [10]:
next(g)

Out[10]:
1
In [11]:
next(g)

Out[11]:
2
In [12]:
next(g)

Out[12]:
3
In [13]:
next(g)

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-13-e734f8aca5ac> in <module>()
----> 1 next(g)

StopIteration: 

Note that each call to a generator function returns a new, independent generator object.

In [16]:
h = gen123()
i = gen123()
h is i

Out[16]:
False
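
The bullet list at the top of this section mentions infinite sequences. As a minimal sketch, the hypothetical generator function `count_up_from` below never finishes on its own; because evaluation is lazy, it only computes a value when next() asks for one.

```python
def count_up_from(start):
    """Hypothetical helper: yield start, start + 1, start + 2, ... forever."""
    current = start
    while True:
        yield current
        current += 1


naturals = count_up_from(1)

# Each next() resumes the loop body until it reaches the next yield.
first_three = [next(naturals), next(naturals), next(naturals)]
fourth = next(naturals)

print(first_three)
print(fourth)
```

Calling list(naturals) here would never return, so an infinite generator should always be consumed with next() or a bounded loop.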

## Generator Comprehensions

• Similar syntax to list comprehensions, but with parentheses
• Creates a generator object
• Concise
• Lazy evaluation
    (expr(item) for item in iterable)
In [1]:
(i for i in range(10))

Out[1]:
<generator object <genexpr> at 0x1060efb88>
In [2]:
list((i for i in range(10)))

Out[2]:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [3]:
sum(i for i in range(10))

Out[3]:
45

This is really useful if we want to calculate values on demand rather than loading an entire series into memory.

In [4]:
# from 0 to 99,999, how many values are NOT divisible by 13? (i % 13 is truthy when the remainder is nonzero)
sum(1 for i in range(100000) if i%13)

Out[4]:
92307
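
To illustrate the memory point, the sketch below compares the size of a list with the size of a generator over the same range using sys.getsizeof(). Note that getsizeof() reports only the container itself (for the list, its array of references), not the integers it refers to, so the real gap is larger than shown.

```python
import sys

as_list = [i for i in range(100000)]       # every value exists up front
as_generator = (i for i in range(100000))  # values are produced one at a time

list_bytes = sys.getsizeof(as_list)
generator_bytes = sys.getsizeof(as_generator)

print(list_bytes, generator_bytes)

# Both give the same total; the generator never holds all values at once.
list_total = sum(as_list)
generator_total = sum(as_generator)
print(list_total == generator_total)
```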

# Itertools

Python provides a robust set of tools for working with iterable sequences: a collection of built-in functions, plus the itertools module, which is part of the Python standard library. Functions in itertools operate on iterators to produce more complex iterators.

We saw two of the built-ins last time when discussing lambda functions: filter() and map().

Some iteration tools produce scalar values; others produce other iterable objects.

## any()

In [6]:
any(name == name.title() for name in ["London","New York","Russia",'cat','bus'])

Out[6]:
True
In [11]:
any(len(name) > 10 for name in ["London","New York","Russia",""])

Out[11]:
False

## all()

In [12]:
all(name == name.title() for name in ["London","New York","Russia",'cat','bus'])

Out[12]:
False
In [14]:
all(len(name) > 0 for name in ["London","New York","Russia",""])

Out[14]:
False
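
Both any() and all() stop as soon as the answer is known, which matters when they are fed a lazy iterator. The sketch below, using made-up data, shows any() leaving the rest of an iterator unconsumed, along with the results for empty input.

```python
names = iter(["London", "cat", "New York", "Russia"])

# any() stops at "cat", the first all-lowercase name.
found_lowercase = any(name.islower() for name in names)

# Whatever any() did not need is still waiting in the iterator.
remaining = list(names)

print(found_lowercase)
print(remaining)

# Edge cases: nothing is true in an empty iterable, and nothing is false either.
any_of_empty = any([])
all_of_empty = all([])
print(any_of_empty, all_of_empty)
```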

## sum()

In [20]:
sum(i for i in range(100))

Out[20]:
4950

## min()

In [22]:
min(i for i in range(100,1000) if i % 17 == 0)

Out[22]:
102

## max()

In [23]:
max(i for i in range(100,1000) if i % 17 == 0)

Out[23]:
986

## zip()

Pairs up the items of two (or more) iterables, position by position, into tuples.

In [15]:
a = list(range(10))
b = list(range(-10,0))
zip(a,b) # Returns a lazy zip object, its own type

Out[15]:
<zip at 0x106152748>
In [16]:
dir(zip(a,b))

Out[16]:
['__class__',
'__delattr__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__gt__',
'__hash__',
'__init__',
'__init_subclass__',
'__iter__',
'__le__',
'__lt__',
'__ne__',
'__new__',
'__next__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__setattr__',
'__sizeof__',
'__str__',
'__subclasshook__']
In [19]:
[item for item in zip(a,b)]

Out[19]:
[(0, -10),
(1, -9),
(2, -8),
(3, -7),
(4, -6),
(5, -5),
(6, -4),
(7, -3),
(8, -2),
(9, -1)]
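
Two behaviors of zip() are worth a short sketch: it stops at the end of the shortest input, and combined with the * operator it can split pairs back apart. The lists here are illustrative.

```python
letters = ['a', 'b', 'c']
numbers = [1, 2, 3, 4, 5]

# zip() stops when the shortest iterable runs out; 4 and 5 are dropped.
pairs = list(zip(letters, numbers))
print(pairs)

# zip(*pairs) passes each pair as a separate argument, which "unzips" them.
unzipped_letters, unzipped_numbers = zip(*pairs)
print(unzipped_letters)
print(unzipped_numbers)
```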

## enumerate()

Generates (index, value) tuples, one for each item in an iterable.

In [29]:
my_list = 'Iterator tools are useful to move across iterable objects in complex ways.'.split()
enumerate(my_list)

Out[29]:
<enumerate at 0x1061ad3f0>
In [30]:
[i for i in enumerate(my_list)]

Out[30]:
[(0, 'Iterator'),
(1, 'tools'),
(2, 'are'),
(3, 'useful'),
(4, 'to'),
(5, 'move'),
(6, 'across'),
(7, 'iterable'),
(8, 'objects'),
(9, 'in'),
(10, 'complex'),
(11, 'ways.')]
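
In practice enumerate() is most often unpacked directly in a for loop. The sketch below also uses its optional start parameter to count from 1; the shortened sentence is just sample data.

```python
my_list = "Iterator tools are useful".split()

numbered = []
# Unpack each (index, value) tuple into two names; start=1 shifts the count.
for position, word in enumerate(my_list, start=1):
    numbered.append(f"{position}: {word}")

print(numbered)
```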

# itertools module

In [3]:
import itertools


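## .combinations()

Produces every way of choosing r items from an iterable, where order does not matter (('a', 'b') and ('b', 'a') count as the same combination).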

In [48]:
x = ['a','b','c','d']
[i for i in itertools.combinations(x,2)]

Out[48]:
[('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
In [53]:
# Note that we can also collect the items of an iterator by passing it to the list constructor
list(itertools.combinations(x,2))

Out[53]:
[('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
In [55]:
# combinations with replacement
list(itertools.combinations_with_replacement(x,4))

Out[55]:
[('a', 'a', 'a', 'a'),
('a', 'a', 'a', 'b'),
('a', 'a', 'a', 'c'),
('a', 'a', 'a', 'd'),
('a', 'a', 'b', 'b'),
('a', 'a', 'b', 'c'),
('a', 'a', 'b', 'd'),
('a', 'a', 'c', 'c'),
('a', 'a', 'c', 'd'),
('a', 'a', 'd', 'd'),
('a', 'b', 'b', 'b'),
('a', 'b', 'b', 'c'),
('a', 'b', 'b', 'd'),
('a', 'b', 'c', 'c'),
('a', 'b', 'c', 'd'),
('a', 'b', 'd', 'd'),
('a', 'c', 'c', 'c'),
('a', 'c', 'c', 'd'),
('a', 'c', 'd', 'd'),
('a', 'd', 'd', 'd'),
('b', 'b', 'b', 'b'),
('b', 'b', 'b', 'c'),
('b', 'b', 'b', 'd'),
('b', 'b', 'c', 'c'),
('b', 'b', 'c', 'd'),
('b', 'b', 'd', 'd'),
('b', 'c', 'c', 'c'),
('b', 'c', 'c', 'd'),
('b', 'c', 'd', 'd'),
('b', 'd', 'd', 'd'),
('c', 'c', 'c', 'c'),
('c', 'c', 'c', 'd'),
('c', 'c', 'd', 'd'),
('c', 'd', 'd', 'd'),
('d', 'd', 'd', 'd')]
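
The lengths of the outputs above follow the standard counting formulas. As a sanity-check sketch, math.comb() (available in Python 3.8 and later) gives the same numbers without generating any tuples.

```python
import itertools
import math

x = ['a', 'b', 'c', 'd']
n = len(x)

pairs = list(itertools.combinations(x, 2))
with_replacement = list(itertools.combinations_with_replacement(x, 4))

# Choosing r of n items without replacement: C(n, r)
expected_pairs = math.comb(n, 2)

# Choosing r of n items with replacement: C(n + r - 1, r)
expected_with_replacement = math.comb(n + 4 - 1, 4)

print(len(pairs), expected_pairs)
print(len(with_replacement), expected_with_replacement)
```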

## .permutations()

In [56]:
list(itertools.permutations(x))

Out[56]:
[('a', 'b', 'c', 'd'),
('a', 'b', 'd', 'c'),
('a', 'c', 'b', 'd'),
('a', 'c', 'd', 'b'),
('a', 'd', 'b', 'c'),
('a', 'd', 'c', 'b'),
('b', 'a', 'c', 'd'),
('b', 'a', 'd', 'c'),
('b', 'c', 'a', 'd'),
('b', 'c', 'd', 'a'),
('b', 'd', 'a', 'c'),
('b', 'd', 'c', 'a'),
('c', 'a', 'b', 'd'),
('c', 'a', 'd', 'b'),
('c', 'b', 'a', 'd'),
('c', 'b', 'd', 'a'),
('c', 'd', 'a', 'b'),
('c', 'd', 'b', 'a'),
('d', 'a', 'b', 'c'),
('d', 'a', 'c', 'b'),
('d', 'b', 'a', 'c'),
('d', 'b', 'c', 'a'),
('d', 'c', 'a', 'b'),
('d', 'c', 'b', 'a')]

## .count()

Creates an iterator that counts upward from start in increments of step, without end.

In [5]:
counter = itertools.count(start=0,step=.3)

In [6]:
next(counter)

Out[6]:
0
In [7]:
next(counter)

Out[7]:
0.3
In [8]:
next(counter)

Out[8]:
0.6
In [10]:
list(zip(itertools.count(step=5),"Georgetown"))

Out[10]:
[(0, 'G'),
(5, 'e'),
(10, 'o'),
(15, 'r'),
(20, 'g'),
(25, 'e'),
(30, 't'),
(35, 'o'),
(40, 'w'),
(45, 'n')]

## .repeat()

In [15]:
list(itertools.repeat("a",10))

Out[15]:
['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a']

## .chain()

Lazily concatenates iterables, yielding the items of each in turn without the memory overhead of building a combined copy.

In [17]:
list(itertools.chain('ABC', 'DEF'))

Out[17]:
['A', 'B', 'C', 'D', 'E', 'F']
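
When the iterables to join are already collected in a single container, the alternate constructor chain.from_iterable() accepts that container directly. A minimal sketch with made-up nested data:

```python
import itertools

nested = [[1, 2], [3], [4, 5, 6]]

# from_iterable() takes one iterable of iterables and flattens one level.
flat = list(itertools.chain.from_iterable(nested))

# Equivalent to unpacking the sublists as separate arguments.
flat_unpacked = list(itertools.chain(*nested))

print(flat)
print(flat == flat_unpacked)
```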

## .islice()

Slices as we would normally slice a list, but works on any iterable and returns the result lazily as an iterator.

In [23]:
[i for i in itertools.islice('abcd',2)]

Out[23]:
['a', 'b']
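
Because islice() is lazy, it can take a finite piece of an infinite iterator such as itertools.count(), which ordinary slice syntax cannot do. The sketch below also shows the start, stop, and step arguments; the inputs are illustrative.

```python
import itertools

# An endless stream of even numbers; islice() takes only the first five.
evens = itertools.count(start=0, step=2)
first_five = list(itertools.islice(evens, 5))
print(first_five)

# start=2, stop=6, step=2 selects the items at positions 2 and 4.
window = list(itertools.islice("abcdefgh", 2, 6, 2))
print(window)
```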