PPOLS564: Foundations of Data Science

Lecture 13

Matrix Operations

Concepts for today:

  • Recap
    • Matrix Multiplication
    • Matrix Addition & Subtraction
  • Transposing Matrices
  • Different Types of Matrices
In [1]:
import numpy as np

Multiplying Matrices

Matrix multiplication can be thought of as a transformation/function.


$$f(\vec{x}) = \textbf{A}\vec{x}$$

In [2]:
# an arbitrary vector in R^2
x = np.array([1,2])

# a function that transforms the vector
def f(x): 
    return np.array([x[0] - x[1],3*x[1]])

# a matrix that performs the same transformation
A = np.array([[ 1., -1.],[ 0.,  3.]])
In [3]:
f(x)
Out[3]:
array([-1,  6])
In [4]:
A.dot(x)
Out[4]:
array([-1.,  6.])
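As a quick check that the function and the matrix describe the same transformation, the sketch below compares them on a few vectors. The names `f_demo`, `A_demo`, and `test_vectors` are illustrative assumptions chosen so that nothing in the notebook is overwritten. It also shows that each column of the matrix is the image of a standard basis vector, which is one way to read a matrix off a linear function. The `@` operator used here is equivalent to `.dot()` for these arrays.

```python
import numpy as np

# Hypothetical names, chosen to avoid overwriting the notebook's variables.
def f_demo(v):
    # Same rule as f above: (v0 - v1, 3 * v1)
    return np.array([v[0] - v[1], 3 * v[1]])

A_demo = np.array([[1., -1.],
                   [0.,  3.]])

# The function and the matrix agree on several arbitrary vectors.
test_vectors = [np.array([1., 2.]), np.array([0., 0.]), np.array([-4., 7.5])]
same_everywhere = all(np.allclose(f_demo(v), A_demo @ v) for v in test_vectors)

# Each column of the matrix is the image of a standard basis vector.
e1, e2 = np.eye(2)
columns_match = (np.allclose(f_demo(e1), A_demo[:, 0])
                 and np.allclose(f_demo(e2), A_demo[:, 1]))

print(same_everywhere, columns_match)
```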

Multiplying matrices is equivalent to nesting two functions.



$$ f(g(\vec{x}_{2x1})) = \textbf{A}_{2x2}\textbf{B}_{2x2} \vec{x}_{2x1} = \vec{z}_{2x1}$$



That is, it's the same as performing the two operations one after the other.



$$ \textbf{B}_{2x2}\vec{x} = \vec{y}_{2x1}$$

$$ \textbf{A}_{2x2}\vec{y} = \vec{z}_{2x1}$$



Which is the same as



$$ g(\vec{x}_{2x1}) = \vec{y}_{2x1}$$

$$ f(\vec{y}_{2x1}) = \vec{z}_{2x1} $$



That is, we transform $\vec{x}$ by $\textbf{B}$ and then transform that resulting vector by $\textbf{A}$ much as we would with the nested function $f(g(\vec{x}))$.

In [5]:
B = np.array([[-3. ,  1. ],[ 0.5,  2.3]])
B
Out[5]:
array([[-3. ,  1. ],
       [ 0.5,  2.3]])
In [6]:
# Multiplying two conformable matrices and then multiplying by the vector
A.dot(B).dot(x)
Out[6]:
array([-6.1, 15.3])
In [7]:
# Is the same as doing each step independently
y = B.dot(x)
z = A.dot(y)
z
Out[7]:
array([-6.1, 15.3])
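To make the nesting explicit, here is a minimal sketch that first composes the two transformations into a single matrix and then compares the result with the step-by-step route. The `_demo` names are assumptions used only to keep this example separate from the notebook's variables.

```python
import numpy as np

A_demo = np.array([[1., -1.], [0., 3.]])
B_demo = np.array([[-3., 1.], [0.5, 2.3]])
x_demo = np.array([1., 2.])

# Route 1: compose the transformations into one matrix, then apply it.
AB = A_demo @ B_demo
z_composed = AB @ x_demo

# Route 2: apply B first, then A, like the nested function f(g(x)).
z_stepwise = A_demo @ (B_demo @ x_demo)

print(np.allclose(z_composed, z_stepwise))
```

Comparing with `np.allclose` rather than `==` is a deliberate choice: the two routes perform the floating-point arithmetic in a different order, so tiny rounding differences are possible.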

Properties of Matrix Multiplication

NOT COMMUTATIVE (in general)

$$ \textbf{A} \textbf{B} \ne \textbf{B} \textbf{A} $$

ASSOCIATIVE

$$(\textbf{A} \textbf{B}) \textbf{C} = \textbf{A} (\textbf{B} \textbf{C}) = \textbf{A} \textbf{B} \textbf{C} $$

DISTRIBUTIVE

$$\textbf{A}(\textbf{B} + \textbf{C}) = \textbf{A}\textbf{B} + \textbf{A}\textbf{C}$$

But remember it's not commutative, so order matters!
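These properties can be checked numerically. The sketch below uses three small matrices; `C_demo` is an arbitrary matrix made up for illustration, and the other two reuse the values of the matrices from the previous section under new names.

```python
import numpy as np

A_demo = np.array([[1., -1.], [0., 3.]])
B_demo = np.array([[-3., 1.], [0.5, 2.3]])
C_demo = np.array([[2., 0.], [1., 1.]])  # arbitrary, for illustration only

# Order matters: AB and BA differ for these matrices.
not_commutative = not np.allclose(A_demo @ B_demo, B_demo @ A_demo)

# Grouping does not matter.
associative = np.allclose((A_demo @ B_demo) @ C_demo,
                          A_demo @ (B_demo @ C_demo))

# Multiplication distributes over addition, keeping A on the left.
distributive = np.allclose(A_demo @ (B_demo + C_demo),
                           A_demo @ B_demo + A_demo @ C_demo)

print(not_commutative, associative, distributive)
```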

Matrix Addition and Subtraction

Much like vectors, matrices are added and subtracted element-wise.

$$\textbf{B}_{3x2} = \begin{bmatrix} 2 & 1 \\ -1 & -2 \\ 4 & 3 \\ \end{bmatrix} $$

$$\textbf{C}_{3x2} = \begin{bmatrix} 1 & 2 \\ -2 & 1 \\ 2 & 1 \\ \end{bmatrix} $$

In [8]:
B = np.array([[2,1],[-1,-2],[4,3]])
B
Out[8]:
array([[ 2,  1],
       [-1, -2],
       [ 4,  3]])
In [9]:
C = np.array([[1,2],[-2,1],[2,1]])
C
Out[9]:
array([[ 1,  2],
       [-2,  1],
       [ 2,  1]])

Addition

$$ \textbf{B}_{3x2} + \textbf{C}_{3x2} $$

$$ \begin{bmatrix} 2 & 1 \\ -1 & -2 \\ 4 & 3 \\ \end{bmatrix} + \begin{bmatrix} 1 & 2 \\ -2 & 1 \\ 2 & 1 \\ \end{bmatrix} $$

$$ \begin{bmatrix} 2 + 1 & 1 + 2 \\ -1 + -2 & -2 + 1 \\ 4 + 2& 3 + 1\\ \end{bmatrix} $$

$$ \begin{bmatrix} 3 & 3 \\ -3 & -1 \\ 6 & 4\\ \end{bmatrix} $$

In [10]:
B + C
Out[10]:
array([[ 3,  3],
       [-3, -1],
       [ 6,  4]])

Subtraction

$$ \textbf{B}_{3x2} - \textbf{C}_{3x2} $$

$$ \begin{bmatrix} 2 & 1 \\ -1 & -2 \\ 4 & 3 \\ \end{bmatrix} - \begin{bmatrix} 1 & 2 \\ -2 & 1 \\ 2 & 1 \\ \end{bmatrix} $$

$$ \begin{bmatrix} 2 - 1 & 1 - 2 \\ -1 - -2 & -2 - 1 \\ 4 - 2& 3 - 1\\ \end{bmatrix} $$

$$ \begin{bmatrix} 1 & -1 \\ 1 & -3 \\ 2 & 2\\ \end{bmatrix} $$

In [11]:
B - C
Out[11]:
array([[ 1, -1],
       [ 1, -3],
       [ 2,  2]])

The matrices must have the same dimensions, so that every element has a corresponding element

In [12]:
D = np.array([[1,2],[2,4]])
D
Out[12]:
array([[1, 2],
       [2, 4]])
In [13]:
B - D
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-7135e4a29578> in <module>()
----> 1 B - D

ValueError: operands could not be broadcast together with shapes (3,2) (2,2) 
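One caveat worth keeping in mind: NumPy only raises this error when the shapes cannot be broadcast. Some mismatched shapes broadcast silently, which is not matrix subtraction in the linear algebra sense. The sketch below illustrates this with a hypothetical helper, `strict_subtract`, which is not a NumPy function and simply refuses anything but identical shapes.

```python
import numpy as np

def strict_subtract(left, right):
    """Subtract two arrays only if their shapes match exactly.

    Hypothetical helper for illustration; not part of NumPy.
    """
    left, right = np.asarray(left), np.asarray(right)
    if left.shape != right.shape:
        raise ValueError(f"shape mismatch: {left.shape} vs {right.shape}")
    return left - right

B_demo = np.array([[2, 1], [-1, -2], [4, 3]])  # shape (3, 2)
row = np.array([1, 2])                         # shape (2,)

# NumPy broadcasts the length-2 row across all three rows without complaint.
broadcast_result = B_demo - row

# The strict helper rejects the same operation.
try:
    strict_subtract(B_demo, row)
    rejected = False
except ValueError:
    rejected = True

print(broadcast_result)
print(rejected)
```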

Transposing a Matrix

$$\textbf{A}_{2x3} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ \end{bmatrix} $$

$$\textbf{A}^T_{3x2} = \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22}\\ a_{13} & a_{23}\\ \end{bmatrix} $$

In [ ]:
A = np.array([[1,2,3],
             [4,5,6]])
A
In [ ]:
A.T

Properties

$$ (\textbf{A}^T)^T = \textbf{A} $$

$$ (\textbf{A} + \textbf{B})^T = \textbf{A}^T + \textbf{B}^T $$

$$ (c\textbf{A})^T = c\textbf{A}^T $$

$$ (\textbf{A}\textbf{B})^T = \textbf{B}^T \textbf{A}^T $$
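The transpose properties can be verified numerically. The matrices and the scalar `c` below are arbitrary values assumed for illustration. Note the product rule in particular: the order of the factors reverses under transposition, and for non-square matrices the unreversed order does not even produce the right shape.

```python
import numpy as np

A_demo = np.array([[1., 2., 3.], [4., 5., 6.]])      # 2x3
B_demo = np.array([[0., 1., -1.], [2., 0., 5.]])     # 2x3, same shape as A_demo
C_demo = np.array([[1., 0.], [2., -1.], [0., 3.]])   # 3x2, conformable with A_demo
c = 2.5

double_transpose = np.array_equal(A_demo.T.T, A_demo)
sum_rule = np.allclose((A_demo + B_demo).T, A_demo.T + B_demo.T)
scalar_rule = np.allclose((c * A_demo).T, c * A_demo.T)

# Product rule: transposing a product reverses the order of the factors.
product_rule = np.allclose((A_demo @ C_demo).T, C_demo.T @ A_demo.T)

# Keeping the original order gives a 3x3 matrix, not the 2x2 we need.
wrong_order_shape = (A_demo.T @ C_demo.T).shape

print(double_transpose, sum_rule, scalar_rule, product_rule, wrong_order_shape)
```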

"Squaring" a matrix: Sum of Squares

Recall that to multiply two matrices, they must be conformable: the number of columns in the first must equal the number of rows in the second. We can always manufacture this condition by multiplying a matrix by its own transpose.

In [14]:
A
Out[14]:
array([[ 1., -1.],
       [ 0.,  3.]])
In [15]:
At = A.T
At
Out[15]:
array([[ 1.,  0.],
       [-1.,  3.]])
In [16]:
A.dot(At)
Out[16]:
array([[ 2., -3.],
       [-3.,  9.]])
In [17]:
At.dot(A)
Out[17]:
array([[ 1., -1.],
       [-1., 10.]])

What is going on here?



$$ \textbf{A}_{2x3} \textbf{A}^T_{3x2} $$



$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ \end{bmatrix} \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22}\\ a_{13} & a_{23}\\ \end{bmatrix} $$



$$ \begin{bmatrix} a_{11}a_{11} + a_{12}a_{12} + a_{13}a_{13} & a_{11}a_{21} + a_{12}a_{22} + a_{13}a_{23}\\ a_{21}a_{11} + a_{22}a_{12} + a_{23}a_{13} & a_{21}a_{21} + a_{22}a_{22} + a_{23}a_{23}\\ \end{bmatrix} $$



$$ \begin{bmatrix}a & b\\ c & d\\ \end{bmatrix} $$



With numbers this time ...



$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ \end{bmatrix} \begin{bmatrix} 1 & 4\\ 2 & 5\\ 3 & 6\\ \end{bmatrix} $$



$$ \begin{bmatrix} 1(1) + 2(2) + 3(3) & 1(4) + 2(5) + 3(6)\\ 4(1) + 5(2) + 6(3) & 4(4) + 5(5) + 6(6)\\ \end{bmatrix} $$



$$ \begin{bmatrix}14 & 32\\ 32& 77\\ \end{bmatrix} $$



Given what we know about vector dot products...



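$$ \begin{bmatrix} \text{squared length} & \text{dot product}\\ \text{dot product} & \text{squared length}\\ \end{bmatrix} $$



In other words, a matrix multiplied by its transpose generates sums of squares: the diagonal holds each row's squared length, and the off-diagonal entries hold the dot products between rows (how much they project onto one another).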
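The numeric example above can be reproduced in code. This is a small sketch with an assumed name `A_demo` for the 2x3 matrix. It also shows that the multiplication order decides whether rows or columns are being compared.

```python
import numpy as np

A_demo = np.array([[1., 2., 3.],
                   [4., 5., 6.]])

rows_product = A_demo @ A_demo.T   # 2x2: compares the rows of A_demo
cols_product = A_demo.T @ A_demo   # 3x3: compares the columns of A_demo

# Sum of squares of each row and each column, computed directly.
row_sums_of_squares = (A_demo ** 2).sum(axis=1)
col_sums_of_squares = (A_demo ** 2).sum(axis=0)

# The diagonals of the two products hold exactly these sums of squares.
rows_ok = np.allclose(np.diag(rows_product), row_sums_of_squares)
cols_ok = np.allclose(np.diag(cols_product), col_sums_of_squares)

# The off-diagonal entry is the dot product of row 0 with row 1.
off_diagonal = rows_product[0, 1]

print(rows_product)
print(rows_ok, cols_ok, off_diagonal)
```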

In [18]:
# Consider what the squared matrix would look like if the
# column vectors are orthogonal

G = np.array([[4,0],
              [0,17]])


# They don't project onto one another. 
G.dot(G.T)
Out[18]:
array([[ 16,   0],
       [  0, 289]])

Different Types of Matrices

In [19]:
X = np.random.randn(25).reshape(5,5).round(1)
X
Out[19]:
array([[-0.5, -0.4, -0. , -0.7, -0.7],
       [ 0.6,  1.5, -2.1,  0.8, -0.8],
       [ 1.6, -0.4,  0.9,  0.8, -0.8],
       [ 0.2,  0.3,  1. , -0.9,  0.6],
       [ 0.7,  0.1,  0.4,  0. ,  0.7]])

Symmetric Matrices

In [20]:
X.dot(X.T).round(1)
Out[20]:
array([[ 1.4, -0.9, -0.6, -0. , -0.9],
       [-0.9,  8.3, -0.2, -2.7, -0.8],
       [-0.6, -0.2,  4.8, -0.1,  0.9],
       [-0. , -2.7, -0.1,  2.3,  1. ],
       [-0.9, -0.8,  0.9,  1. ,  1.2]])
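A symmetric matrix equals its own transpose. The product of any matrix with its transpose is symmetric, because $(\textbf{X}\textbf{X}^T)^T = (\textbf{X}^T)^T\textbf{X}^T = \textbf{X}\textbf{X}^T$. The sketch below checks this on a non-square random matrix; the seed and the shape are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)     # arbitrary seed
X_demo = rng.normal(size=(5, 3))    # deliberately not square

S = X_demo @ X_demo.T               # 5x5

# A symmetric matrix is equal to its own transpose.
is_symmetric = np.allclose(S, S.T)

# The original matrix is not even square, so it cannot be symmetric.
original_is_square = X_demo.shape[0] == X_demo.shape[1]

print(S.shape, is_symmetric, original_is_square)
```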

Upper Triangular Matrices

In [21]:
np.triu(X)
Out[21]:
array([[-0.5, -0.4, -0. , -0.7, -0.7],
       [ 0. ,  1.5, -2.1,  0.8, -0.8],
       [ 0. ,  0. ,  0.9,  0.8, -0.8],
       [ 0. ,  0. ,  0. , -0.9,  0.6],
       [ 0. ,  0. ,  0. ,  0. ,  0.7]])

Lower Triangular Matrices

In [22]:
# Lower Triangle
np.tril(X)
Out[22]:
array([[-0.5,  0. ,  0. ,  0. ,  0. ],
       [ 0.6,  1.5,  0. ,  0. ,  0. ],
       [ 1.6, -0.4,  0.9,  0. ,  0. ],
       [ 0.2,  0.3,  1. , -0.9,  0. ],
       [ 0.7,  0.1,  0.4,  0. ,  0.7]])
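The two triangles overlap only on the main diagonal, so a square matrix can be rebuilt from its upper and lower triangles once the doubly counted diagonal is subtracted. The sketch below shows this, along with the `k` offset argument of `np.triu`, which drops the diagonal when set to 1. The matrix `X_demo` is made up for illustration.

```python
import numpy as np

X_demo = np.arange(1., 17.).reshape(4, 4)

upper = np.triu(X_demo)               # zeros below the main diagonal
lower = np.tril(X_demo)               # zeros above the main diagonal
diagonal = np.diag(np.diag(X_demo))   # only the main diagonal kept

# Upper + lower counts the diagonal twice, so subtract it once.
rebuilt = upper + lower - diagonal

# k=1 starts one step above the main diagonal (strictly upper triangular).
strict_upper = np.triu(X_demo, k=1)

print(np.array_equal(rebuilt, X_demo))
print(strict_upper)
```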

Diagonal Matrices

In [23]:
np.diag(np.array([4,2,10,-1]))
Out[23]:
array([[ 4,  0,  0,  0],
       [ 0,  2,  0,  0],
       [ 0,  0, 10,  0],
       [ 0,  0,  0, -1]])

Zero Matrices

In [24]:
np.zeros((5,5))
Out[24]:
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

Idempotent Matrices

A matrix that, when multiplied by itself, yields itself: $\textbf{P}\textbf{P} = \textbf{P}$.

In [25]:
P = np.array([[2,-2,-4],[-1,3,4],[1,-2,-3]])
P
Out[25]:
array([[ 2, -2, -4],
       [-1,  3,  4],
       [ 1, -2, -3]])
In [26]:
P.dot(P)
Out[26]:
array([[ 2, -2, -4],
       [-1,  3,  4],
       [ 1, -2, -3]])
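A consequence of the definition is that every higher power of an idempotent matrix also equals the matrix itself. The sketch below wraps the check in a hypothetical helper, `is_idempotent` (an assumed name, not a NumPy function), and tries it on a few matrices.

```python
import numpy as np

def is_idempotent(M):
    """Return True if M multiplied by itself gives back M (hypothetical helper)."""
    M = np.asarray(M)
    return bool(np.allclose(M @ M, M))

P_demo = np.array([[2, -2, -4], [-1, 3, 4], [1, -2, -3]])
shear = np.array([[1, 1], [0, 1]])   # a counterexample: not idempotent

checks = {
    "P_demo": is_idempotent(P_demo),
    "identity": is_idempotent(np.eye(3)),
    "zeros": is_idempotent(np.zeros((3, 3))),
    "shear": is_idempotent(shear),
}

# Any higher power of an idempotent matrix is still the same matrix.
fifth_power_same = np.array_equal(np.linalg.matrix_power(P_demo, 5), P_demo)

print(checks, fifth_power_same)
```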

Identity Matrix

In [27]:
I = np.eye(5)
I
Out[27]:
array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

Note that an identity matrix is also a diagonal and idempotent matrix.

In [28]:
I.dot(I)
Out[28]:
array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

A matrix multiplied by the identity matrix returns the original matrix.

In [29]:
I.dot(X)
Out[29]:
array([[-0.5, -0.4,  0. , -0.7, -0.7],
       [ 0.6,  1.5, -2.1,  0.8, -0.8],
       [ 1.6, -0.4,  0.9,  0.8, -0.8],
       [ 0.2,  0.3,  1. , -0.9,  0.6],
       [ 0.7,  0.1,  0.4,  0. ,  0.7]])
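The identity works from either side, but for a non-square matrix the two identities must have different sizes to stay conformable. A minimal sketch, with `X_demo` as an assumed 2x3 example:

```python
import numpy as np

X_demo = np.array([[1., 2., 3.],
                   [4., 5., 6.]])    # 2x3

left_identity = np.eye(2)    # must match the number of rows of X_demo
right_identity = np.eye(3)   # must match the number of columns of X_demo

left_ok = np.array_equal(left_identity @ X_demo, X_demo)
right_ok = np.array_equal(X_demo @ right_identity, X_demo)

print(left_ok, right_ok)
```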

Sparse Matrices

In [30]:
X = np.zeros((10,10))
X[[1,4,6,5,2],[1,5,3,5,1]] = 1
X
Out[30]:
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
In [31]:
np.nonzero(X)
Out[31]:
(array([1, 2, 4, 5, 6]), array([1, 1, 5, 5, 3]))

Save only the nonzero entries, as a list of (row, column) coordinates and values

In [32]:
from scipy import sparse
print(sparse.csc_matrix(X))
  (1, 1)	1.0
  (2, 1)	1.0
  (6, 3)	1.0
  (4, 5)	1.0
  (5, 5)	1.0
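The sparse representation stores only the nonzero entries, and it can be converted back to a dense array without loss. The sketch below assumes SciPy is installed and uses `_demo` style names that are not part of the notebook. It also builds the same matrix directly from coordinates, which avoids allocating the dense array at all.

```python
import numpy as np
from scipy import sparse

rows = [1, 4, 6, 5, 2]
cols = [1, 5, 3, 5, 1]

# Dense version: 100 stored values, only 5 of them nonzero.
dense_demo = np.zeros((10, 10))
dense_demo[rows, cols] = 1

# Compressed sparse column (CSC) version of the same matrix.
sparse_demo = sparse.csc_matrix(dense_demo)
stored_values = sparse_demo.nnz

# Converting back to a dense array recovers the original.
round_trip_ok = np.array_equal(sparse_demo.toarray(), dense_demo)

# Building directly from (data, (row, col)) coordinates skips the dense step.
direct_demo = sparse.csc_matrix((np.ones(len(rows)), (rows, cols)), shape=(10, 10))
direct_ok = np.array_equal(direct_demo.toarray(), dense_demo)

print(dense_demo.size, stored_values, round_trip_ok, direct_ok)
```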