PPOLS564: Foundations of Data Science

Lecture 13

Matrix Operations

Concepts For today:¶

- Recap
  - Matrix Multiplication
  - Matrix Addition & Subtraction
- Transposing Matrices
- Different Types of Matrices

import numpy as np

Multiplying Matrices¶

Matrix multiplication can be thought of as a transformation/function.

$$f(\vec{x}) = \textbf{A}\vec{x}$$

# an arbitrary vector in R^2
x = np.array([1,2])

# a function that transforms the vector
def f(x): 
    return np.array([x[0] - x[1],3*x[1]])

# a matrix that performs the same transformation
A = np.array([[ 1., -1.],[ 0.,  3.]])

f(x)

array([-1,  6])

A.dot(x)

array([-1.,  6.])
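One way to see where the entries of the matrix come from, as a minimal sketch: because the transformation is linear, the columns of the matrix are what the function does to the standard basis vectors. The names `e1`, `e2`, and `A_from_f` are introduced here for illustration only.

```python
import numpy as np

# the same transformation used above
def f(x):
    return np.array([x[0] - x[1], 3 * x[1]])

# standard basis vectors of R^2
e1 = np.array([1., 0.])
e2 = np.array([0., 1.])

# stacking f(e1) and f(e2) as columns rebuilds the matrix
A_from_f = np.column_stack([f(e1), f(e2)])
print(A_from_f)
```

This only works for linear functions; a function such as `x[0]**2` has no matrix that reproduces it.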

Multiplying matrices is equivalent to nesting two functions.

$$ f(g(\vec{x}_{2x1})) = \textbf{A}_{2x2}\textbf{B}_{2x2} \vec{x}_{2x1} = \vec{z}_{2x1}$$

That is, it's the same as performing the operations one at a time.

$$ \textbf{B}_{2x2}\vec{x} = \vec{y}_{2x1}$$

$$ \textbf{A}_{2x2}\vec{y} = \vec{z}_{2x1}$$

Which is the same as

$$ g(\vec{x}_{2x1}) = \vec{y}_{2x1}$$

$$ f(\vec{y}_{2x1}) = \vec{z}_{2x1} $$

That is, we transform $\vec{x}$ by $\textbf{B}$ and then transform that resulting vector by $\textbf{A}$ much as we would with the nested function $f(g(\vec{x}))$.

B = np.array([[-3. ,  1. ],[ 0.5,  2.3]])
B

array([[-3. ,  1. ],
       [ 0.5,  2.3]])

# Multiplying two conformable matrices and then multiplying by the vector
A.dot(B).dot(x)

array([-6.1, 15.3])

# Is the same as doing each step independently
y = B.dot(x)
z = A.dot(y)
z

array([-6.1, 15.3])
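The nesting idea can also be written with explicit functions. The sketch below wraps each matrix in a function (`f` and `g` here are illustrative wrappers, not part of NumPy) and compares the nested call against the single combined matrix.

```python
import numpy as np

A = np.array([[1., -1.], [0., 3.]])
B = np.array([[-3., 1.], [0.5, 2.3]])
x = np.array([1., 2.])

def g(v):
    # inner transformation: apply B
    return B @ v

def f(v):
    # outer transformation: apply A
    return A @ v

nested = f(g(x))           # transform by B, then by A
combined = (A @ B) @ x     # one matrix that does both steps
print(np.allclose(nested, combined))
```

The `@` operator is equivalent to `.dot()` for these two-dimensional arrays.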

Properties of Matrix Multiplication¶

~~COMMUTATIVE~~

$$ \textbf{A} \textbf{B} \ne \textbf{B} \textbf{A} \quad \text{(in general)} $$

ASSOCIATIVE

$$(\textbf{A} \textbf{B}) \textbf{C} = \textbf{A} (\textbf{B} \textbf{C}) = \textbf{A} \textbf{B} \textbf{C} $$

DISTRIBUTIVE

$$\textbf{A}(\textbf{B} + \textbf{C}) = \textbf{A}\textbf{B} + \textbf{A}\textbf{C}$$

But remember it's not commutative, so order matters!
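These properties can be checked numerically. The sketch below reuses the two 2x2 matrices from earlier and adds a third matrix `C`, which is made up for this example.

```python
import numpy as np

A = np.array([[1., -1.], [0., 3.]])
B = np.array([[-3., 1.], [0.5, 2.3]])
C = np.array([[2., 0.], [1., 1.]])   # hypothetical third matrix

# order matters: AB and BA differ for these matrices
not_commutative = not np.allclose(A @ B, B @ A)

# grouping does not matter
associative = np.allclose((A @ B) @ C, A @ (B @ C))

# multiplication distributes over addition
distributive = np.allclose(A @ (B + C), A @ B + A @ C)

print(not_commutative, associative, distributive)
```

`np.allclose` is used instead of `==` because floating-point products can differ in the last few digits.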

Matrix Addition and Subtraction¶

Much like vectors, adding and subtracting matrices is done element-wise.

$$\textbf{B}_{3x2} = \begin{bmatrix} 2 & 1 \\ -1 & -2 \\ 4 & 3 \\ \end{bmatrix} $$

$$\textbf{C}_{3x2} = \begin{bmatrix} 1 & 2 \\ -2 & 1 \\ 2 & 1 \\ \end{bmatrix} $$

B = np.array([[2,1],[-1,-2],[4,3]])
B

array([[ 2,  1],
       [-1, -2],
       [ 4,  3]])

C = np.array([[1,2],[-2,1],[2,1]])
C

array([[ 1,  2],
       [-2,  1],
       [ 2,  1]])

Addition¶

$$ \textbf{B}_{3x2} + \textbf{C}_{3x2} $$

$$ \begin{bmatrix} 2 & 1 \\ -1 & -2 \\ 4 & 3 \\ \end{bmatrix} + \begin{bmatrix} 1 & 2 \\ -2 & 1 \\ 2 & 1 \\ \end{bmatrix} $$

$$ \begin{bmatrix} 2 + 1 & 1 + 2 \\ -1 + -2 & -2 + 1 \\ 4 + 2& 3 + 1\\ \end{bmatrix} $$

$$ \begin{bmatrix} 3 & 3 \\ -3 & -1 \\ 6 & 4\\ \end{bmatrix} $$

B + C

array([[ 3,  3],
       [-3, -1],
       [ 6,  4]])

Subtraction¶

$$ \textbf{B}_{3x2} - \textbf{C}_{3x2} $$

$$ \begin{bmatrix} 2 & 1 \\ -1 & -2 \\ 4 & 3 \\ \end{bmatrix} - \begin{bmatrix} 1 & 2 \\ -2 & 1 \\ 2 & 1 \\ \end{bmatrix} $$

$$ \begin{bmatrix} 2 - 1 & 1 - 2 \\ -1 - -2 & -2 - 1 \\ 4 - 2& 3 - 1\\ \end{bmatrix} $$

$$ \begin{bmatrix} 1 & -1 \\ 1 & -3 \\ 2 & 2\\ \end{bmatrix} $$

B - C

array([[ 1, -1],
       [ 1, -3],
       [ 2,  2]])

Matrices must have the same dimensions¶

D = np.array([[1,2],[2,4]])
D

array([[1, 2],
       [2, 4]])

B - D

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-7135e4a29578> in <module>()
----> 1 B - D

ValueError: operands could not be broadcast together with shapes (3,2) (2,2)
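The error message mentions broadcasting because NumPy will silently stretch some mismatched shapes rather than fail. The sketch below shows a strict check (the helper `strict_subtract` is hypothetical, written for this example) next to a case where broadcasting succeeds.

```python
import numpy as np

B = np.array([[2, 1], [-1, -2], [4, 3]])
D = np.array([[1, 2], [2, 4]])

def strict_subtract(left, right):
    """Subtract two arrays only if their shapes are identical (hypothetical helper)."""
    if left.shape != right.shape:
        raise ValueError(f"shape mismatch: {left.shape} vs {right.shape}")
    return left - right

message = None
try:
    strict_subtract(B, D)
except ValueError as err:
    message = str(err)
    print(message)

# Broadcasting: a length-2 row is subtracted from every row of B
row = np.array([1, 2])
broadcasted = B - row
print(broadcasted.shape)
```

Broadcasting is convenient, but it is not matrix subtraction in the linear algebra sense, so a strict check can be worth having when the shapes are supposed to match.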

Transposing a Matrix¶

$$\textbf{A}_{2x3} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ \end{bmatrix} $$

$$\textbf{A}^T_{3x2} = \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22}\\ a_{13} & a_{23}\\ \end{bmatrix} $$

A = np.array([[1,2,3],
             [4,5,6]])
A

A.T

Properties¶

$$ (\textbf{A}^T)^T = \textbf{A} $$

$$ (\textbf{A} + \textbf{B})^T = \textbf{A}^T + \textbf{B}^T $$

$$ (c\textbf{A})^T = c\textbf{A}^T $$

$$ (\textbf{A}\textbf{B})^T = \textbf{B}^T \textbf{A}^T $$
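A quick numerical check of the transpose properties, as a sketch. Note that the transpose of a product reverses the order of the factors. The matrix `A2` and the scalar `c` are made up for this example.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])        # 2x3
A2 = np.array([[0, 1, 0], [2, 0, 2]])       # 2x3, hypothetical
B = np.array([[2, 1], [-1, -2], [4, 3]])    # 3x2
c = 3

double_transpose = np.array_equal(A.T.T, A)
sum_rule = np.array_equal((A + A2).T, A.T + A2.T)
scalar_rule = np.array_equal((c * A).T, c * A.T)
product_rule = np.array_equal((A @ B).T, B.T @ A.T)

# A.T @ B.T is 3x3 here, so it cannot equal the 2x2 matrix (A @ B).T
wrong_order_shape = (A.T @ B.T).shape

print(double_transpose, sum_rule, scalar_rule, product_rule, wrong_order_shape)
```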

"Squaring" a matrix: Sum of Squares¶

Recall that to multiply two matrices, the number of columns in the first must equal the number of rows in the second. We can always satisfy this condition by multiplying a matrix by its own transpose.

A = np.array([[ 1., -1.],[ 0.,  3.]])  # the 2x2 matrix from the multiplication section
A

array([[ 1., -1.],
       [ 0.,  3.]])

At = A.T
At

array([[ 1.,  0.],
       [-1.,  3.]])

A.dot(At)

array([[ 2., -3.],
       [-3.,  9.]])

At.dot(A)

array([[ 1., -1.],
       [-1., 10.]])

What is going on here?

$$ \textbf{A}_{2x3} \textbf{A}^T_{3x2} $$

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ \end{bmatrix} \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22}\\ a_{13} & a_{23}\\ \end{bmatrix} $$

$$ \begin{bmatrix} a_{11}a_{11} + a_{12}a_{12} + a_{13}a_{13} & a_{11}a_{21} + a_{12}a_{22} + a_{13}a_{23}\\ a_{21}a_{11} + a_{22}a_{12} + a_{23}a_{13} & a_{21}a_{21} + a_{22}a_{22} + a_{23}a_{23}\\ \end{bmatrix} $$

$$ \begin{bmatrix}a & b\\ c & d\\ \end{bmatrix} $$

With numbers this time ...

$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ \end{bmatrix} \begin{bmatrix} 1 & 4\\ 2 & 5\\ 3 & 6\\ \end{bmatrix} $$

$$ \begin{bmatrix} 1(1) + 2(2) + 3(3) & 1(4) + 2(5) + 3(6)\\ 4(1) + 5(2) + 6(3) & 4(4) + 5(5) + 6(6)\\ \end{bmatrix} $$

$$ \begin{bmatrix}14 & 32\\ 32& 77\\ \end{bmatrix} $$

Given what we know about vector dot products...

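$$ \begin{bmatrix} \text{length}^2 & \text{projection}\\ \text{projection} & \text{length}^2\\ \end{bmatrix} $$

In other words, a matrix multiplied by its transpose generates sums of squares: each diagonal entry is the sum of squares (the squared length) of one row, and each off-diagonal entry is the dot product of two different rows.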
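To make this concrete, the sketch below computes the same 2x3 example and compares the entries of the product against row-wise sums of squares and the dot product of the two rows. The variable names are chosen for this example.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
gram = A @ A.T

row_sums_of_squares = (A ** 2).sum(axis=1)   # squared length of each row
row_dot = A[0] @ A[1]                        # dot product of the two rows

print(gram)
print(row_sums_of_squares, row_dot)
```

The diagonal of `gram` matches `row_sums_of_squares`, and both off-diagonal entries equal `row_dot`.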

# Consider what the squared matrix would look like if the
# column vectors are orthogonal

G = np.array([[4,0],
              [0,17]])


# They don't project onto one another. 
G.dot(G.T)

array([[ 16,   0],
       [  0, 289]])

Different Types of Matrices¶

X = np.random.randn(25).reshape(5,5).round(1)
X

array([[-0.5, -0.4, -0. , -0.7, -0.7],
       [ 0.6,  1.5, -2.1,  0.8, -0.8],
       [ 1.6, -0.4,  0.9,  0.8, -0.8],
       [ 0.2,  0.3,  1. , -0.9,  0.6],
       [ 0.7,  0.1,  0.4,  0. ,  0.7]])

Symmetric Matrices¶

X.dot(X.T).round(1)

array([[ 1.4, -0.9, -0.6, -0. , -0.9],
       [-0.9,  8.3, -0.2, -2.7, -0.8],
       [-0.6, -0.2,  4.8, -0.1,  0.9],
       [-0. , -2.7, -0.1,  2.3,  1. ],
       [-0.9, -0.8,  0.9,  1. ,  1.2]])
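A symmetric matrix equals its own transpose, and any matrix multiplied by its transpose is symmetric, even when the original is not square. A minimal check, assuming an arbitrary seed and a 5x3 matrix chosen for this example:

```python
import numpy as np

rng = np.random.default_rng(0)          # seed chosen arbitrarily
X = rng.standard_normal((5, 3))         # not square, not symmetric

S = X @ X.T                             # 5x5
is_symmetric = np.allclose(S, S.T)
print(S.shape, is_symmetric)
```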

Upper Triangle Matrices¶

np.triu(X)

array([[-0.5, -0.4, -0. , -0.7, -0.7],
       [ 0. ,  1.5, -2.1,  0.8, -0.8],
       [ 0. ,  0. ,  0.9,  0.8, -0.8],
       [ 0. ,  0. ,  0. , -0.9,  0.6],
       [ 0. ,  0. ,  0. ,  0. ,  0.7]])

Lower Triangle Matrices¶

# Lower Triangle
np.tril(X)

array([[-0.5,  0. ,  0. ,  0. ,  0. ],
       [ 0.6,  1.5,  0. ,  0. ,  0. ],
       [ 1.6, -0.4,  0.9,  0. ,  0. ],
       [ 0.2,  0.3,  1. , -0.9,  0. ],
       [ 0.7,  0.1,  0.4,  0. ,  0.7]])
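The two triangles fit together: the upper triangle plus the strictly lower triangle rebuilds the original matrix. The sketch below uses the `k` argument of `np.tril` to exclude the diagonal, and a small integer matrix chosen for this example. It also checks that a product of upper triangular matrices stays upper triangular.

```python
import numpy as np

X = np.arange(1, 10).reshape(3, 3)

upper = np.triu(X)                 # diagonal and above
strict_lower = np.tril(X, k=-1)    # strictly below the diagonal
rebuilt = upper + strict_lower

# the product of two upper triangular matrices is upper triangular
product = upper @ upper
stays_upper = np.array_equal(np.triu(product), product)

print(np.array_equal(rebuilt, X), stays_upper)
```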

Diagonal Matrices¶

np.diag(np.array([4,2,10,-1]))

array([[ 4,  0,  0,  0],
       [ 0,  2,  0,  0],
       [ 0,  0, 10,  0],
       [ 0,  0,  0, -1]])
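`np.diag` works in both directions: given a 1-D array it builds a diagonal matrix, and given a 2-D array it extracts the diagonal. The sketch below also shows what multiplying by a diagonal matrix does: it scales rows from the left and columns from the right. The matrix `M` is made up for this example.

```python
import numpy as np

d = np.array([4, 2, 10, -1])
D = np.diag(d)                 # 1-D input: build a diagonal matrix
recovered = np.diag(D)         # 2-D input: extract the diagonal

M = np.arange(16).reshape(4, 4)
scaled_rows = D @ M            # row i of M is multiplied by d[i]
scaled_cols = M @ D            # column j of M is multiplied by d[j]

print(recovered)
print(np.array_equal(scaled_rows, d[:, None] * M))
print(np.array_equal(scaled_cols, M * d))
```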

Zero Matrices¶

np.zeros((5,5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

Idempotent Matrices¶

A matrix that, when multiplied by itself, yields itself: $\textbf{P}\textbf{P} = \textbf{P}$.

P = np.array([[2,-2,-4],[-1,3,4],[1,-2,-3]])
P

array([[ 2, -2, -4],
       [-1,  3,  4],
       [ 1, -2, -3]])

P.dot(P)

array([[ 2, -2, -4],
       [-1,  3,  4],
       [ 1, -2, -3]])
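Two consequences follow from the definition and can be checked as a sketch: every higher power of an idempotent matrix is the same matrix, and the identity minus the matrix is also idempotent. The names `p_fifth` and `complement` are chosen for this example.

```python
import numpy as np

P = np.array([[2, -2, -4], [-1, 3, 4], [1, -2, -3]])
I = np.eye(3, dtype=int)

# any power of an idempotent matrix gives the matrix back
p_fifth = np.linalg.matrix_power(P, 5)

# I - P is idempotent as well
complement = I - P
complement_squared = complement @ complement

print(np.array_equal(p_fifth, P))
print(np.array_equal(complement_squared, complement))
```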

Identity Matrix¶

I = np.eye(5)
I

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

Note that an identity matrix is also a diagonal and idempotent matrix.

I.dot(I)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

A matrix multiplied by the identity matrix returns the original matrix.

I.dot(X)

array([[-0.5, -0.4,  0. , -0.7, -0.7],
       [ 0.6,  1.5, -2.1,  0.8, -0.8],
       [ 1.6, -0.4,  0.9,  0.8, -0.8],
       [ 0.2,  0.3,  1. , -0.9,  0.6],
       [ 0.7,  0.1,  0.4,  0. ,  0.7]])
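The same holds for matrices that are not square, as long as the identity matrix has the right size on each side. A small sketch with a 2x3 matrix chosen for this example:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])   # 2x3

left = np.eye(2) @ A     # a 2x2 identity on the left
right = A @ np.eye(3)    # a 3x3 identity on the right

print(np.array_equal(left, A), np.array_equal(right, A))
```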

Sparse Matrices¶

X = np.zeros((10,10))
X[[1,4,6,5,2],[1,5,3,5,1]] = 1
X

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

np.nonzero(X)

(array([1, 2, 4, 5, 6]), array([1, 1, 5, 5, 3]))

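Save in a sparse format (here, compressed sparse column), which stores only the nonzero entries; when printed, it resembles an adjacency list of (row, column) pairs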

from scipy import sparse
print(sparse.csc_matrix(X))

  (1, 1)	1.0
  (2, 1)	1.0
  (6, 3)	1.0
  (4, 5)	1.0
  (5, 5)	1.0
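As a rough sketch of why this matters, the sparse version stores 5 values instead of 100, and it can be converted back to a dense array without losing anything. The variable names below are chosen for this example.

```python
import numpy as np
from scipy import sparse

X = np.zeros((10, 10))
X[[1, 4, 6, 5, 2], [1, 5, 3, 5, 1]] = 1

S = sparse.csc_matrix(X)

# number of stored entries versus the total number of cells
print(S.nnz, X.size)

# coordinate form gives the (row, column) pairs directly
coo = S.tocoo()
pairs = sorted(zip(coo.row.tolist(), coo.col.tolist()))
print(pairs)

# converting back to dense recovers the original matrix
round_trip = np.array_equal(S.toarray(), X)
print(round_trip)
```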