Have you ever wondered why you need to know matrix operations? The answer is dead simple: **to work with matrices**!

**Above all**, I assume you already know the importance of linear algebra in Machine Learning and that you are familiar with the basic definitions. Therefore, I do not need to talk about why it is important to know the matrix operations.

**Do I?** In this tutorial, I will explain the most *important matrix operations* **that we** *desperately need* and *frequently encounter* in Machine Learning.

*Here’s what you will learn here:*

- The *core matrix operations* such as matrix transpose, multiplication, and inversion.
- For each of the operations, you will learn *how to implement them in Python*.
- I will explain the *properties* of the explained operations.

I assume you are somewhat familiar with Python or are in the process of learning it. **If you would like to learn Python the easy way**, you can check my YouTube course, which is freely available to all.

*However, you do NOT need to know Python to understand the concepts presented in this article.*

### Matrix Transpose

The transpose of a matrix is an operator that switches the row and column indices of the matrix. After transposing a matrix $A$, we get a new matrix, denoted $A^T$. The element in row $i$, column $j$ of $A^T$ is the element in row $j$, column $i$ of $A$. If we write the elements of $A$ as $A_{ij}$, then we have the following:

$$(A^T)_{ij} = A_{ji}$$

We can have the following example to clarify better:

$$A = \begin{bmatrix} 1 & 2.2 \\ 4 & 7 \\ 8 & -2 \end{bmatrix} \quad \Rightarrow \quad A^T = \begin{bmatrix} 1 & 4 & 8 \\ 2.2 & 7 & -2 \end{bmatrix}$$

Calculating a matrix transpose is dead easy with Python. For instance, you can try the following code:

    # Use the NumPy package
    import numpy as np

    # Define a 3x2 matrix using np.array
    A = np.array([[1, 2.2], [4, 7], [8, -2]])
    # A = [[ 1.   2.2]
    #      [ 4.   7. ]
    #      [ 8.  -2. ]]
    print("A is: {}".format(A))
    print("The shape of A is: {}".format(A.shape))

    # Use the transpose() method
    B = A.transpose()
    # B = [[ 1.   4.   8. ]
    #      [ 2.2  7.  -2. ]]
    print("B is: {}".format(B))
    print("The shape of B is: {}".format(B.shape))

You should get the following output:

    A is: [[ 1.   2.2]
     [ 4.   7. ]
     [ 8.  -2. ]]
    The shape of A is: (3, 2)
    B is: [[ 1.   4.   8. ]
     [ 2.2  7.  -2. ]]
    The shape of B is: (2, 3)

**NOTE:** It is worth noting that $(A^T)^T = A$.
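Transposing a matrix twice returns the original matrix, and this is easy to confirm numerically. The following is a minimal sketch that reuses the 3x2 matrix from the example above; `A.T` is NumPy's shorthand attribute for `A.transpose()`.

```python
import numpy as np

# Same 3x2 matrix as in the transpose example
A = np.array([[1, 2.2], [4, 7], [8, -2]])

# A.T is shorthand for A.transpose()
A_T = A.T
A_TT = A_T.T

print("Shape of A^T:", A_T.shape)
print("Transposing twice gives A back:", np.array_equal(A_TT, A))
```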

### Identity Matrix

The identity matrix, denoted $I_n$, is a square matrix (number of rows = number of columns = $n$) whose elements along the main diagonal are all 1’s and whose other elements are all zero. For example, we can have the following identity matrix:

$$I_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

You can create the above matrix using the following code:

    # Use the NumPy package
    import numpy as np

    # Define an identity matrix
    # Ref: https://docs.scipy.org/doc/numpy/reference/generated/numpy.eye.html
    A = np.eye(4)

### Adding Operation

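Adding two matrices is possible only if both matrices **have the same shape**. Assume $C = A + B$; then every element of $C$ is the sum of the corresponding elements, $C_{ij} = A_{ij} + B_{ij}$. Let’s get back to Python and define the same two matrices as in the transpose example above. After that, we will try to add them together: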

    # Use the NumPy package
    import numpy as np

    # Define a 3x2 matrix using np.array
    A = np.array([[1, 2.2], [4, 7], [8, -2]])

    # Use the transpose() method
    B = A.transpose()

    # Create a matrix similar to A in shape but filled with random numbers
    # Use the *A.shape argument
    A_like = np.random.randn(*A.shape)

    # Add two matrices of the same shape
    M = A + A_like
    print("M equals to: ", M)

    # Add two matrices with different shapes
    C = A + B
    print("C equals to: ", C)

You should get the following output:

    M equals to:  [[ 2.4212905   2.88158481]
     [ 4.34872344  5.01038501]
     [ 7.58194231 -2.0192284 ]]
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-35-293b043ce9a9> in <module>()
         16
         17 # Add two matrices with different shape
    ---> 18 C = A + B
         19 print("C equals to: ", C)

    ValueError: operands could not be broadcast together with shapes (3,2) (2,3)

There are some important characteristics for adding matrices that you do not want to miss:

- $A + B = B + A$
- $(A + B) + C = A + (B + C)$
- $A + 0 = 0 + A = A$
- $(A + B)^T = A^T + B^T$

You can simply confirm the above properties with the following Python code:

    # Use the NumPy package
    import numpy as np

    # Define random 3x4 matrices using np.random.randint
    # Ref: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.randint.html
    A = np.random.randint(10, size=(3, 4))
    B = np.random.randint(10, size=(3, 4))
    C = np.random.randint(10, size=(3, 4))

    # np.all() tests whether all array elements are True.
    # More info: https://docs.scipy.org/doc/numpy/reference/generated/numpy.all.html
    checkProperty = np.all(A + B == B + A)
    if checkProperty:
        print('Property A + B == B + A is confirmed!')

    checkProperty = np.all((A + B) + C == A + (B + C))
    if checkProperty:
        print('Property (A + B) + C == A + (B + C) is confirmed!')

    checkProperty = np.all(A + 0 == 0 + A)
    if checkProperty:
        print('Property A + 0 == 0 + A is confirmed!')

    checkProperty = np.all((A + B).transpose() == A.transpose() + B.transpose())
    if checkProperty:
        print('Property (A + B)^T == A^T + B^T is confirmed!')

### Scalar Multiplication

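If we multiply a matrix by a number (a.k.a. a scalar), the result equals multiplying every entry of the matrix by that specific scalar, i.e., $(\alpha A)_{ij} = \alpha A_{ij}$. For example, you can check the following calculation:

$$2 \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix}$$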

The properties of scalar and matrix multiplication are as below:
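A quick way to get a feel for these properties is to check a few identities numerically. The sketch below assumes the usual ones: distributivity over matrix addition and over scalar addition, compatibility of scalar products, and commuting with the transpose. The matrices and scalars are arbitrary illustrative values, chosen as integers so the comparisons are exact.

```python
import numpy as np

# Illustrative integer matrices and scalars (arbitrary values)
A = np.array([[1, 2], [3, 4], [5, 6]])
B = np.array([[0, -1], [2, 7], [-3, 1]])
alpha, beta = 3, -2

# Scalar multiplication acts on every entry of the matrix
print("alpha * A =\n", alpha * A)

# Standard identities (assumed here, not taken from the article's list)
checks = {
    "alpha(A + B) = alpha.A + alpha.B":
        np.array_equal(alpha * (A + B), alpha * A + alpha * B),
    "(alpha + beta)A = alpha.A + beta.A":
        np.array_equal((alpha + beta) * A, alpha * A + beta * A),
    "(alpha.beta)A = alpha(beta.A)":
        np.array_equal((alpha * beta) * A, alpha * (beta * A)),
    "(alpha.A)^T = alpha.A^T":
        np.array_equal((alpha * A).T, alpha * A.T),
}
for name, ok in checks.items():
    print("Property {} holds: {}".format(name, ok))
```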

### Matrix Multiplication

Assume we would like to multiply two matrices $A$ and $B$ and calculate the output as $C = AB$. To do so, the dimensions of $A$ and $B$ should match. **Here, being matched does not mean being equal**. Matching means the number of *columns* **of $A$ should equal the number of** *rows* **in $B$. For example, assume the shape of $A$ is $m \times n$. Then, the shape of $B$ MUST be $n \times p$, so $n$ is shared. As a result of such a multiplication, the shape of $C$ is $m \times p$.**

Check the following Python code to have a better understanding of matrix multiplication:

    # Use the NumPy package
    import numpy as np

    # Define two random 3x4 matrices using np.random.randint
    # Ref: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.randint.html
    A = np.random.randint(10, size=(3, 4))
    B = np.random.randint(10, size=(3, 4))
    print('Shape A is: {}'.format(A.shape))
    print('Shape B is: {}'.format(B.shape))

    # Get the number of columns in A and the number of rows in B
    A_num_columns = A.shape[1]
    B_num_rows = B.shape[0]

    # Check the dimensions
    if A_num_columns != B_num_rows:
        print('dimension mismatch')

    # You should get an error as A_num_columns != B_num_rows
    C = np.matmul(A , B)

The output will be:

    Shape A is: (3, 4)
    Shape B is: (3, 4)
    dimension mismatch
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-38-dc16b8d9d74a> in <module>()
         17
         18 # You should get an error as A_num_columns != B_num_columns
    ---> 19 C = np.matmul(A , B)

    ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 4)

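Mathematically speaking, for $A$ of shape $m \times n$ and $B$ of shape $n \times p$, we can calculate the elements of $C = AB$ as below:

$$C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$$

An illustrative example is shown below:

$$\begin{bmatrix} 2 & 3 & -1 \\ 7 & 0 & 2 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 2 & -1 \\ 1 & 5 \end{bmatrix} = \begin{bmatrix} 2 \cdot 2 + 3 \cdot 2 + (-1) \cdot 1 & 2 \cdot 1 + 3 \cdot (-1) + (-1) \cdot 5 \\ 7 \cdot 2 + 0 \cdot 2 + 2 \cdot 1 & 7 \cdot 1 + 0 \cdot (-1) + 2 \cdot 5 \end{bmatrix} = \begin{bmatrix} 9 & -6 \\ 16 & 17 \end{bmatrix}$$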

You can implement the above example with the following code and confirm the results:

    # Use the NumPy package
    import numpy as np

    # Define two matrices
    A = np.array([[2,3,-1],[7,0,2]])
    B = np.array([[2,1],[2,-1],[1,5]])
    print('Shape A is: {}'.format(A.shape))
    print('Shape B is: {}'.format(B.shape))

    # Get the number of columns in A and the number of rows in B
    A_num_columns = A.shape[1]
    B_num_rows = B.shape[0]

    # Check the dimensions
    if A_num_columns != B_num_rows:
        print('dimension mismatch')

    # The dimensions match here (3 == 3), so the multiplication succeeds
    C = np.matmul(A , B)
    print('C=AB= {}'.format(C))

    # Instead of C=AB let's calculate C=BA
    C = np.matmul(B , A)
    print('C=BA= {}'.format(C))

The output will be:

    Shape A is: (2, 3)
    Shape B is: (3, 2)
    C=AB= [[ 9 -6]
     [16 17]]
    C=BA= [[11  6  0]
     [-3  6 -4]
     [37  3  9]]
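To see what `np.matmul` computes under the hood, it can help to spell out the element-wise rule, where each entry of $C$ is the sum over $k$ of $A_{ik} B_{kj}$, with explicit loops. The following is only an illustrative sketch: `matmul_loops` is a hypothetical helper name, and in practice `np.matmul` is far faster and should be preferred.

```python
import numpy as np

def matmul_loops(A, B):
    """Multiply A (m x n) by B (n x p) with explicit loops (illustrative helper)."""
    m, n = A.shape
    n_b, p = B.shape
    if n != n_b:
        raise ValueError("dimension mismatch: {} columns vs {} rows".format(n, n_b))
    C = np.zeros((m, p), dtype=np.result_type(A, B))
    for i in range(m):          # rows of A
        for j in range(p):      # columns of B
            for k in range(n):  # shared dimension
                C[i, j] += A[i, k] * B[k, j]
    return C

# Same matrices as in the example above
A = np.array([[2, 3, -1], [7, 0, 2]])
B = np.array([[2, 1], [2, -1], [1, 5]])

C = matmul_loops(A, B)
print("C =", C)
print("Matches np.matmul:", np.array_equal(C, np.matmul(A, B)))
```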

You can clearly check that $AB \neq BA$. Properties of matrix multiplication are as below:

- $A(BC) = (AB)C$
- $A(B + C) = AB + AC$
- $(AB)^T = B^T A^T$

You can check all the above properties with the following Python code:

    # Use the NumPy package
    import numpy as np

    # Define three random 3x3 matrices using np.random.randint
    # Ref: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.randint.html
    A = np.random.randint(10, size=(3, 3))
    B = np.random.randint(10, size=(3, 3))
    C = np.random.randint(10, size=(3, 3))

    # np.all() tests whether all array elements are True.
    # More info: https://docs.scipy.org/doc/numpy/reference/generated/numpy.all.html
    checkProperty = np.all(np.matmul(A,np.matmul(B,C)) == np.matmul(np.matmul(A,B),C))
    if checkProperty:
        print('Property A(BC) = (AB)C is confirmed!')

    checkProperty = np.all(np.matmul(A,B+C) == np.matmul(A,B) + np.matmul(A,C))
    if checkProperty:
        print('Property A(B+C) = AB + AC is confirmed!')

    checkProperty = np.all(np.matmul(A,B).transpose() == np.matmul(B.transpose(),A.transpose()))
    if checkProperty:
        print('Property (AB)^T = B^T.A^T is confirmed!')

It is important to address the matrix power. Assume you see something like $A^3$. This simply means the matrix is multiplied by itself so that it appears **three times** in the product. It can be shown as below:

$$A^3 = A \cdot A \cdot A$$

**BUT**, the *matrix should be a square matrix*, i.e., the number of its rows and columns **MUST** be the same. Otherwise, the power operation would be invalid. Run the following code to see what would happen:

    # Calculate the matrix power for a square and a non-square matrix.

    # Use the NumPy package
    import numpy as np

    # A: 3x3 SQUARE matrix using np.random.randint
    A = np.random.randint(10, size=(3, 3))
    print('A has the shape of: {}'.format(A.shape))

    # Calculate A^3
    n = 3
    output = np.linalg.matrix_power(A, n)
    print('Output has the shape of: {}'.format(output.shape))

    # B: 3x4 NON-SQUARE matrix using np.random.randint
    B = np.random.randint(10, size=(3, 4))
    print('B has the shape of: {}'.format(B.shape))

    # Try to calculate B^3: this raises an error because B is not square
    n = 3
    output = np.linalg.matrix_power(B, n)
    print('Output has the shape of: {}'.format(output.shape))
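If you prefer a script that runs to the end instead of stopping at the error, the non-square case can be wrapped in a `try`/`except`. The sketch below also confirms that `np.linalg.matrix_power(A, 3)` agrees with multiplying three copies of `A` by hand. The matrices are arbitrary illustrative values, and catching both `ValueError` and `np.linalg.LinAlgError` is a defensive assumption, since the exact exception type may depend on the NumPy version.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])  # arbitrary 2x2 example

# A^3 written out as repeated multiplication
manual = np.matmul(np.matmul(A, A), A)
builtin = np.linalg.matrix_power(A, 3)
print("A.A.A equals matrix_power(A, 3):", np.array_equal(manual, builtin))

# A non-square matrix has no matrix power, so NumPy raises an error
B = np.array([[1, 2, 3], [4, 5, 6]])
try:
    np.linalg.matrix_power(B, 3)
    power_failed = False
except (ValueError, np.linalg.LinAlgError) as err:
    power_failed = True
    print("matrix_power failed for the 2x3 matrix:", err)
```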

### Vector-Matrix Multiplication

At first, I repeat the following note from one of the previous posts:

**NOTE:** We usually represent a vector using one column and multiple rows of elements. Hence, we can treat a vector of size $n$ as an $n \times 1$ matrix. In general, we can informally say vectors are a special kind of matrix that is 1-dimensional.

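Similarly, to multiply the vector $v$ by the matrix $A$, the dimensions must match. Assume our matrix $A$ has the size of $m \times n$ and we would like to calculate $Av$. What is the size of $v$? Can you guess? The only acceptable size for $v$ is $n \times 1$. So we have the following operation:

$$\underbrace{w}_{m \times 1} = \underbrace{A}_{m \times n} \, \underbrace{v}_{n \times 1}$$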

You can implement a working example as below:

    # Use the NumPy package
    import numpy as np

    # A: 3x4 matrix using np.random.randint
    A = np.random.randint(10, size=(3, 4))

    # v: a vector of size 4
    v = np.random.randint(10, size=(4,))
    print('Shape A is: {}'.format(A.shape))
    print('Shape v is: {}'.format(v.shape))

    # Get the number of columns in A and the number of rows in v
    A_num_columns = A.shape[1]
    v_num_rows = v.shape[0]

    # Check the dimensions
    if A_num_columns != v_num_rows:
        print('dimension mismatch')

    # The dimensions match here (4 == 4), so the multiplication succeeds
    w = np.matmul(A , v)
    print('w=Av=', w)
    print('The shape of w is:',w.shape)

### Matrix Inverse

The inverse of a matrix $A$ exists and is denoted $A^{-1}$ when both of the following conditions hold:

- $A$ is a **square matrix**.
- Multiplying $A$ by its inverse results in the identity matrix: $A A^{-1} = A^{-1} A = I$.

If the matrix is invertible, it is called **non-singular**. Otherwise, it is called non-invertible or *singular*. **The concept of a matrix being singular is a relatively more advanced concept, which we will explain in future posts.**

You can practice the calculation of a matrix inverse using the following code:

    # Calculate the inverse of a square matrix.

    # Use the NumPy package
    import numpy as np

    # A: 3x3 SQUARE matrix using np.random.randint
    A = np.random.randint(10, size=(3, 3))
    print('A has the shape of: {}'.format(A.shape))

    # Calculate the matrix inverse
    ainv = np.linalg.inv(A)
    print('The inverse of A is:',ainv)

    # Check to see if multiplication of A and A^{-1} equals the identity matrix I
    if np.allclose(np.dot(A, ainv), np.eye(A.shape[0])):
        print('A is invertible!')
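One caveat with random integer matrices is that they can occasionally turn out to be singular, in which case `np.linalg.inv` raises `np.linalg.LinAlgError`. Here is a minimal sketch of guarding against that. `safe_inverse` is a hypothetical helper name, and the two matrices are hand-picked for illustration. Keep in mind that a nearly singular matrix may not trigger the error, so this is not a complete singularity test.

```python
import numpy as np

def safe_inverse(M):
    """Return the inverse of M, or None if NumPy reports M as singular."""
    try:
        return np.linalg.inv(M)
    except np.linalg.LinAlgError:
        return None

# The second row is twice the first row, so this matrix is singular
S = np.array([[1.0, 2.0], [2.0, 4.0]])
# This matrix has a non-zero determinant (1*4 - 2*3 = -2), so it is invertible
N = np.array([[1.0, 2.0], [3.0, 4.0]])

s_inv = safe_inverse(S)
n_inv = safe_inverse(N)
print("S is invertible:", s_inv is not None)
print("N is invertible:", n_inv is not None)
```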

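The following properties apply to the **matrix inversion operation** (assuming $A$ and $B$ are invertible and the scalar $\alpha \neq 0$):

- $(AB)^{-1} = B^{-1} A^{-1}$
- $(A^T)^{-1} = (A^{-1})^T$
- $(\alpha A)^{-1} = \frac{1}{\alpha} A^{-1}$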

You can check the aforementioned properties using the following Python script:

    # Check the properties of the matrix inverse.

    # Use the NumPy package
    import numpy as np

    # Define two 3x3 SQUARE matrices using np.random.randint
    A = np.random.randint(10, size=(3, 3))
    B = np.random.randint(10, size=(3, 3))

    # np.allclose compares floating-point arrays up to a small tolerance

    # Check property (AB)^{-1} = B^{-1}.A^{-1}
    left_side = np.linalg.inv(np.matmul(A,B))
    right_side = np.matmul(np.linalg.inv(B),np.linalg.inv(A))
    checkProperty = np.allclose(left_side, right_side)
    if checkProperty:
        print('Property (AB)^{-1} = B^{-1}.A^{-1} is confirmed!')

    # Check property (A^{T})^{-1} = (A^{-1})^{T}
    left_side = np.linalg.inv(np.transpose(A))
    right_side = np.transpose(np.linalg.inv(A))
    checkProperty = np.allclose(left_side, right_side)
    if checkProperty:
        print('Property (A^{T})^{-1} = (A^{-1})^{T} is confirmed!')

    # np.dot is a very important function.
    # Ref: https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html
    # Check property (alpha.A)^{-1} = (1/alpha)A^{-1}
    alpha = float(3) # Any non-zero scalar (make sure to define a float)
    left_side = np.linalg.inv(np.dot(alpha,A))
    right_side = np.dot(1/alpha,np.linalg.inv(A))
    checkProperty = np.allclose(left_side, right_side)
    if checkProperty:
        print('Property (alpha.A)^{-1} = (1/alpha)A^{-1} is confirmed!')

### Matrix Trace

Assume we have a square matrix $A$ of size $n \times n$. The trace of a matrix is an operator that returns the sum of its diagonal elements as below:

$$\mathrm{tr}(A) = \sum_{i=1}^{n} A_{ii}$$

Working with the trace of a matrix, instead of the matrix itself, is easier for a lot of matrix calculations and mathematical proofs.

We have the following properties for the matrix trace operator:

- $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ if both $AB$ and $BA$ exist and are valid.
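NumPy exposes the trace as `np.trace`. As a small sketch, the code below reuses the 2x3 and 3x2 matrices from the matrix multiplication example to show that the products in both orders have the same trace, even though the two products have different shapes.

```python
import numpy as np

A = np.array([[2, 3, -1], [7, 0, 2]])    # 2x3
B = np.array([[2, 1], [2, -1], [1, 5]])  # 3x2

AB = np.matmul(A, B)  # 2x2
BA = np.matmul(B, A)  # 3x3

# np.trace sums the elements on the main diagonal
print("tr(AB) =", np.trace(AB))
print("tr(BA) =", np.trace(BA))

# The same value computed by summing the diagonal by hand
manual_trace = sum(AB[i, i] for i in range(AB.shape[0]))
print("Manual sum of the diagonal of AB:", manual_trace)
```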

### Special Matrices

There are special kinds of matrices whose names you may hear every day. I would like to briefly describe some of them here.

**Symmetric matrix:** A symmetric matrix equals its transpose as below:

$$A = A^T$$

An example would be the following matrix:

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & -1 \\ 3 & -1 & 4 \end{bmatrix}$$
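Symmetry is straightforward to test in code by comparing a matrix with its transpose. The following is a minimal sketch; `is_symmetric` is a hypothetical helper name, and both matrices are arbitrary illustrative values.

```python
import numpy as np

def is_symmetric(M):
    """True if M is square and equal to its own transpose."""
    return M.shape[0] == M.shape[1] and np.array_equal(M, M.T)

# An illustrative symmetric matrix
S = np.array([[1, 2, 3],
              [2, 5, -1],
              [3, -1, 4]])

# A matrix that is not symmetric (2 != 3 across the diagonal)
N = np.array([[1, 2],
              [3, 4]])

print("S is symmetric:", is_symmetric(S))
print("N is symmetric:", is_symmetric(N))
```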

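**Diagonal matrix:** A diagonal matrix only has non-zero elements on its main diagonal. Everywhere else in the matrix, we have zero elements.

In other words:

$$D_{ij} = 0 \quad \text{for all } i \neq j$$

We can have the following example:

$$D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}$$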

**NOTE:** A diagonal matrix does not have to be a square matrix. The main diagonal of a matrix is formed by the elements $A_{ij}$ where $i = j$.
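As a small sketch of the definition, the code below builds a square diagonal matrix with `np.diag` and a rectangular one by hand, then checks that every element off the main diagonal is zero. `is_diagonal` is a hypothetical helper name and the values are arbitrary.

```python
import numpy as np

def is_diagonal(M):
    """True if every element off the main diagonal (i != j) is zero."""
    rows, cols = np.indices(M.shape)
    return bool(np.all(M[rows != cols] == 0))

# Square diagonal matrix built from its diagonal entries
D = np.diag([1, 2, 3])
print(D)

# Rectangular (2x3) matrix whose only non-zero entries sit where i == j
R = np.zeros((2, 3))
R[0, 0] = 4
R[1, 1] = 5
print(R)

print("D is diagonal:", is_diagonal(D))
print("R is diagonal:", is_diagonal(R))
```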

**Orthogonal matrix:** An orthogonal matrix is **a square matrix** which has the following characteristic:

$$A^T A = A A^T = I$$

Note that if we multiply $A$ by $A^T$ and it results in the identity matrix, this implies that $A^{-1} = A^T$.
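A 2D rotation matrix is a classic example of an orthogonal matrix, so it makes a convenient test case. The sketch below uses an arbitrarily chosen angle and relies on `np.allclose` because the products are only equal to the identity up to floating-point rounding.

```python
import numpy as np

theta = np.pi / 6  # arbitrary rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

I = np.eye(2)

# Multiplying by the transpose on either side gives the identity
print("Q^T Q = I:", np.allclose(np.matmul(Q.T, Q), I))
print("Q Q^T = I:", np.allclose(np.matmul(Q, Q.T), I))

# Hence the inverse equals the transpose
print("Q^{-1} = Q^T:", np.allclose(np.linalg.inv(Q), Q.T))
```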

### Conclusion

In this tutorial, I explained the important matrix operations that are commonly used in Machine Learning. To understand the definitions better, you can refer to a previously published article, titled *Basic Linear Algebra Definitions that You Hear Every Day*. First, I explained each operation. Then, I showed you how to code it in Python. As a result, you got a sense of the practical implementation of the operations in Python in addition to their theoretical interpretation. Hopefully, you found them helpful for gaining a better understanding of the concepts.

P.S. Please share with me your thoughts by commenting below. I might be wrong in what I say, and I love to know when I am wrong. Furthermore, your questions might be my questions, as well. It’s always good to become better even if being the best is impossible in our belief system. So let’s help each other to become better.
