This article is dedicated to Numpy, one of the most important scientific computing libraries for Linear Algebra and, hence, Machine Learning!
In almost any Python program for Machine Learning, you see the Numpy library being used. Why? Doing Machine Learning is impossible without Linear Algebra, and Linear Algebra is built on vectors, matrices, etc. Numpy is one of the best scientific computing packages for Linear Algebra! That is the short answer to the “why” question.

You can conduct a simple 30-minute search to find out if I am right or wrong! Search the web for Machine Learning projects and their available source code. Check how many of them use Numpy. Look at the Python code and see how many times the expression “import numpy” or “import numpy as np” appears at the top. Do it! It will definitely motivate you, unless you are already aware of the importance of Numpy.
Here is what you will learn in this tutorial:
- What is Numpy?
- How can we define arrays in Numpy?
- What are the basic operations?
- How to leverage Numpy for Machine Learning?
- What are the most used tips and tricks in using Numpy?
Before You Move On
You may find the following resources helpful to better understand the concepts in this article:
- Python Tutorials – A FREE Video Course: You will become familiar with Python and its syntax.
- Basic Linear Algebra Definitions that You Hear Every Day: Covers the primary and most frequently used Linear Algebra definitions in Machine Learning.
Data Types
Here, I describe some of the most important Numpy numeric data types that you frequently encounter. Numpy supports numerous data types, perhaps even more than Python itself!
Basic Data Types
The most basic data types are as follows: integer, float, complex, and boolean. You can use Numpy to convert Python elements in many different ways, such as changing an array from one type to another specific type. Let’s start with the following examples:
    # Import Numpy library
    import numpy as np

    # Define a number
    a = 10
    print('Type, before converting: ', type(a))

    # Change the type to float with Numpy
    b = np.float64(a)
    print('Type, after converting: ', type(b))

    # Change the type to float with the Python built-in float
    c = float(a)
    print('Type, after converting: ', type(c))

Output:

    Type, before converting: <class 'int'>
    Type, after converting: <class 'numpy.float64'>
    Type, after converting: <class 'float'>
Question: What is the difference between using np.float64 and the Python float built-in function?
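One way to start exploring the question above is to inspect the two results side by side. The following is only a small sketch built on the example above, not a complete answer:

```python
import numpy as np

a = 10
b = np.float64(a)   # Numpy scalar
c = float(a)        # Python built-in float

# np.float64 is a subclass of the Python float, so both compare as equal floats
print(isinstance(b, float))   # True
print(b == c)                 # True

# The Numpy scalar additionally carries array-style attributes
print(b.dtype, b.itemsize)    # float64 8
print(hasattr(c, 'dtype'))    # False
```

In other words, the values are the same, but the Numpy scalar also knows its dtype and size in bytes, just like a Numpy array does.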
Type Conversion
Now, let’s turn a list into a Numpy array. For now, let’s just focus on the data types and conversion. Later in this article, I will explain how to define Numpy arrays in detail.
    # Import Numpy library
    import numpy as np

    # Define a Python list
    a = [1, 2, 1.3]
    print('Type "a": ', type(a))

    # Turn Python list into a Numpy array of type np.int32 (Integer (-2147483648 to 2147483647))
    b = np.int32(a)
    print('Array "b": ', b)
    print('Type array "b": ', type(b))
    print('Array "b" data type: ', b.dtype)

    # Turn Python list into a Numpy array of type np.float32
    # (single precision; the Python float is double precision, like np.float64)
    c = np.float32(a)
    print('Array "c": ', c)
    print('Type array "c": ', type(c))
    print('Array "c" data type: ', c.dtype)

Output:

    Type "a": <class 'list'>
    Array "b": [1 2 1]
    Type array "b": <class 'numpy.ndarray'>
    Array "b" data type: int32
    Array "c": [1. 2. 1.3]
    Type array "c": <class 'numpy.ndarray'>
    Array "c" data type: float32
We used the .dtype attribute to find out the data type of the elements inside the array. The recommended way to change the type of a Numpy array is to use the .astype() method. Take a look at the following example:
    # Import Numpy library
    import numpy as np

    # Define a Python list
    a = [1, 2, 4]
    print('Type "a": ', type(a))

    # Turn Python list into a Numpy array of type np.int32 (Integer (-2147483648 to 2147483647))
    b = np.int32(a)
    print('Array "b": ', b)
    print('Type array "b": ', type(b))

    # Convert the Numpy array to type np.float32 (single-precision float) with .astype()
    c = b.astype(np.float32)
    print('Array "c": ', c)
    print('Type array "c": ', type(c))

Output:

    Type "a": <class 'list'>
    Array "b": [1 2 4]
    Type array "b": <class 'numpy.ndarray'>
    Array "c": [1. 2. 4.]
    Type array "c": <class 'numpy.ndarray'>
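Two details of .astype() are easy to miss: it returns a new array instead of modifying the original, and converting floats to integers drops the fractional part. The short sketch below illustrates both; the sample values are arbitrary and chosen only for this illustration:

```python
import numpy as np

x = np.array([1.7, -1.7, 2.0])

# astype returns a new array; the original keeps its data type
y = x.astype(np.int32)

print(x.dtype, y.dtype)   # float64 int32
print(y.tolist())         # [1, -1, 2]  (fractional part dropped, not rounded)
```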
Defining a Numpy Array
An array is basically a one- or multi-dimensional grid of values. In a Numpy array, in particular, all values are of the same type (integer, float, etc.). How do we define a Numpy array? For a Numpy array, we have the following definitions:
- Rank: The number of dimensions an array has.
- Shape: A tuple that indicates the number of elements in each dimension. Example: a shape of (2,4,10) indicates a three-dimensional array which has 2, 4, and 10 elements in the first, second, and third dimension, respectively.
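The two definitions above map directly to array attributes. As a quick sketch, using an all-zero array only as a placeholder for the (2,4,10) example:

```python
import numpy as np

# A three-dimensional array with the shape from the example above
arr = np.zeros((2, 4, 10))

print(arr.ndim)    # 3           -> the rank (number of dimensions)
print(arr.shape)   # (2, 4, 10)  -> elements per dimension
print(arr.size)    # 80          -> total number of elements (2 * 4 * 10)
```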
Create an array from a list
Now let’s get started with Python. We start with the most common approach. Let’s create a Numpy array from a list:
    # Import Numpy library
    import numpy as np

    # Define a Python list
    mylist = [1, 2, 4, 8]

    # Create a Numpy array from the list
    numpy_array = np.array(mylist)
    print('Array: ', numpy_array)

Output:

    Array: [1 2 4 8]

We used the np.array function to transform a list into a Numpy array. What if we do not define a list and just input the numbers as below:
    # Import Numpy library
    import numpy as np

    # Naively input the numbers
    numpy_array = np.array(1,2,4,8)
    print('Array: ', numpy_array)

Output:

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-3-6d0f836d151e> in <module>()
          2
          3 # Naively input the numbers
    ----> 4 numpy_array = np.array(1,2,4,8)
          5 print('Array: ', numpy_array)

    ValueError: only 2 non-keyword arguments accepted
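As you can see above, Python complains! np.array expects the values as a single sequence (a list, for example), not as separate arguments. The exact exception type and message may differ depending on your Numpy version.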
Now, let’s define a two-dimensional array:
    # Import Numpy library
    import numpy as np

    # Define the rows as Python lists
    row1 = [2,4,6,8]
    row2 = [1,3,5,7]
    numpy_array = np.array([row1,row2])
    print('Array: ', numpy_array)

    # Get the shape
    print('Shape: ', numpy_array.shape)

Output:

    Array: [[2 4 6 8]
     [1 3 5 7]]
    Shape: (2, 4)

Let’s take a look at the above code once again. We defined a matrix. The argument inside np.array is a list in which each element is another list (see figure above)! The inner lists, denoted as row1 and row2, form the rows of the matrix and MUST have the same size! Think why? That was a simple example to showcase how we can create arrays. I used the .shape attribute to return the Numpy array shape. The output above shows we have a matrix with two rows and four columns.
Special functions
We can create Numpy arrays using some special Numpy functions. I used a couple of them below for your reference:
    # Import Numpy library
    import numpy as np

    ### Defining arrays using special functions ###
    # Arguments:
    #   shape: The shape of the numpy array
    #   dtype: Specifying the data type (not required)

    # Defining an all-zero array
    zeroArray = np.zeros(shape=(3,5), dtype=np.int16)
    print("zeroArray: ", zeroArray)

    # Defining an all-one array
    onesArray = np.ones(shape=(3,5), dtype=np.float32)
    print("onesArray: ", onesArray)

    # Defining an array filled with one specific value
    fullArray = np.full(shape=(3,5), fill_value=4.2, dtype=np.float64)
    print("fullArray: ", fullArray)

Output:

    zeroArray: [[0 0 0 0 0]
     [0 0 0 0 0]
     [0 0 0 0 0]]
    onesArray: [[1. 1. 1. 1. 1.]
     [1. 1. 1. 1. 1.]
     [1. 1. 1. 1. 1.]]
    fullArray: [[4.2 4.2 4.2 4.2 4.2]
     [4.2 4.2 4.2 4.2 4.2]
     [4.2 4.2 4.2 4.2 4.2]]

One of the most important special functions is np.arange, which is similar to the Python range built-in function. An example of using np.arange is as below:
    # Import Numpy library
    import numpy as np

    # Define a Numpy array using np.arange
    # np.arange defines an interval of numbers
    # Arguments:
    #   start: Starting of the interval.
    #   stop: Ending of the interval.
    #   step: Step size.
    # NOTE: The interval includes "start" number but does NOT include "stop" number.
    arr = np.arange(start=3,stop=10,step=2)
    print("Array: ", arr)

Output:

    Array: [3 5 7 9]
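To make the comparison with the Python range built-in concrete, here is a small sketch. The variable names are only illustrative:

```python
import numpy as np

# Same interval as above, once with Numpy and once with the built-in range
ints = np.arange(start=3, stop=10, step=2)
print(ints.tolist() == list(range(3, 10, 2)))   # True

# Unlike range, np.arange also accepts non-integer steps
halves = np.arange(0, 2, 0.5)
print(halves.tolist())   # [0.0, 0.5, 1.0, 1.5]

# The result is a Numpy array, so arithmetic applies to every element
print((ints * 2).tolist())   # [6, 10, 14, 18]
```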

Universal functions
Numpy provides universal functions for performing element-wise operations. Take a look at some of the universal functions demonstrated below:
    # Import Numpy library
    import numpy as np

    v = np.array([1, 3, 5])
    w = np.array([2, 4, 6])

    # Exponential operation
    print('e^v= ', np.exp(v))

    # The square root of an array
    print('v^{1/2}= ', np.sqrt(v))

    # Adding two vectors
    print('v+w= ', np.add(v, w))

    # Sin and Cos of an array
    print('Sin(v)= ', np.sin(v))
    print('Cos(v)= ', np.cos(v))
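"Element-wise" means the function is applied to each element separately, as if we had written a loop. The sketch below checks this against a plain Python loop; the loop is only for illustration and is not how Numpy computes the result internally:

```python
import math
import numpy as np

v = np.array([1, 3, 5])

# Apply the exponential one element at a time with plain Python
looped = [math.exp(x) for x in v]

# Apply the universal function to the whole array at once
vectorized = np.exp(v)

print(np.allclose(looped, vectorized))         # True
print(np.add(v, v).tolist())                   # [2, 6, 10]
print(np.sqrt(np.array([1, 4, 9])).tolist())   # [1.0, 2.0, 3.0]
```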
Random Array
Sometimes, it is desired to define an array with random numbers. There are many ways to do this, depending on the kind of random numbers we want to use. For example, do we want to generate random integers, floats, etc.? Take a look at the approaches below:
    # Import Numpy library
    import numpy as np

    ### Defining arrays with random elements ###
    # Arguments:
    #   size: The size of the numpy array

    # Defining a random array
    randArray = np.random.random(size=(2,3))
    print("randArray: ", randArray)

    # Defining a random array with integer elements
    #   high: The highest element that is allowed to be generated is ``high-1``
    #   low: The lowest integer that is allowed to be generated
    randintArray = np.random.randint(low=1, high=5, size=(2,3))
    print("randintArray: ", randintArray)

Output:

    randArray: [[0.65055144 0.12510875 0.09906024]
     [0.47333278 0.67082837 0.7673982 ]]
    randintArray: [[4 4 1]
     [1 2 2]]
CAVEAT: As we generated random arrays, if you run the above Python code, you will almost certainly NOT get the results I reported above! It was obvious, though. Right?
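If you do need repeatable random numbers, for example to debug an experiment, you can fix the seed. The sketch below is not part of the original example: it assumes a Numpy version that provides np.random.default_rng (1.17 or later), and the seed values 42 and 0 are arbitrary choices.

```python
import numpy as np

# Two generators created with the same seed produce the same numbers
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

first = rng_a.integers(low=1, high=5, size=(2, 3))
second = rng_b.integers(low=1, high=5, size=(2, 3))
print(np.array_equal(first, second))   # True

# The legacy global seed has the same effect on the functions used above
np.random.seed(0)
x = np.random.random(size=(2, 3))
np.random.seed(0)
y = np.random.random(size=(2, 3))
print(np.array_equal(x, y))            # True
```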
Basic Operations
Basic arithmetic operation
Let’s first cover the basic arithmetic operations with some examples:
    # Import Numpy library
    import numpy as np

    # Create two arrays
    a = np.array([[1,5],[0,8]], dtype=np.float32)
    b = np.array([[2,1],[5,2]], dtype=np.float32)
    print('a= ', a)
    print('b= ', b)

    # Elementwise adding
    # a + b and np.add(a, b) are the same.
    # Check to see if a + b and np.add(a, b) are the same using ``assert``.
    # The .all() method checks if all elements of a matrix are True.
    print('a + b= ', a + b)
    assert((a + b == np.add(a, b)).all())

    # Elementwise subtraction
    print('a - b= ', a - b)
    assert((a - b == np.subtract(a, b)).all())

    # Elementwise multiplication
    print('a * b= ', a * b)
    assert((a * b == np.multiply(a, b)).all())

    # Elementwise division
    print('a / b= ', a / b)
    assert((a / b == np.divide(a, b)).all())

    # Elementwise square
    print('a^2= ', np.square(a))
    assert((a ** 2 == np.square(a)).all())

Output:

    a= [[1. 5.]
     [0. 8.]]
    b= [[2. 1.]
     [5. 2.]]
    a + b= [[ 3.  6.]
     [ 5. 10.]]
    a - b= [[-1.  4.]
     [-5.  6.]]
    a * b= [[ 2.  5.]
     [ 0. 16.]]
    a / b= [[0.5 5. ]
     [0.  4. ]]
    a^2= [[ 1. 25.]
     [ 0. 64.]]

Define a Vector – Linear Algebra
Let’s talk about how we define a vector with Numpy. Assume we would like to define a column vector with two elements. Take a careful look at the code below and the shapes of the arrays:
    # Import Numpy library
    import numpy as np

    # Rank-1 array
    v = np.array([0,8])
    print('Shape: ', v.shape)

    # Rank-2 array (row vector)
    v = np.array([[0,8]])
    print('Shape: ', v.shape)

    # Rank-2 array (column vector)
    v = np.array([[0],[8]])
    print('Shape: ', v.shape)
In the above code, we defined arrays with the same numeric values but different ranks and shapes. The first definition creates a rank-1 array (it has only one dimension). The second creates a rank-2 array which is a row vector (1 row and multiple columns). For defining vectors, the preferred form is the third definition, which results in a rank-2 array and a column vector (multiple rows and 1 column). The output of the above code is as below:
    Shape: (2,)
    Shape: (1, 2)
    Shape: (2, 1)
Remember we do NOT usually need to define vectors as we did in the second and third definitions. That approach is a little bit complicated with all those nested Python lists! Now let’s do it the easy way:
    import numpy as np

    # Rank-1 array
    v = np.array([0,8])
    print('Shape: ', v.shape)

    # Rank-2 array (row vector)
    row_v = v.reshape(1,-1)
    print('Shape: ', row_v.shape)

    # Rank-2 array (column vector)
    column_v = v.reshape(-1,1)
    print('Shape: ', column_v.shape)
What did I do above? (1) I used “-1”, which tells Numpy to infer the size of that dimension from the total number of elements, so it covers all rows (columns). (2) I used the Numpy “reshape” method, which simply returns the array with the desired shape (details later in this tutorial). (3) I used “1” to fix the size of the other dimension to one!
Let me explain the line row_v = v.reshape(1,-1) of the above code for further illustration. (1) “-1” stands for the number of columns, which is the total number of elements of the vector v, that is, 2. (2) The Numpy “reshape” method returns an array of shape (1,2), which means the new vector (row_v) has 2 columns and only one row! It is worth emphasizing that row_v is a row vector, as it only has one row.
NOTE: In simple words, (1,-1) means put only one row and place all elements in columns, and (-1,1) means put only one column and place all elements in rows. Check the below figure.
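The role of “-1” is easier to see with a slightly longer vector. In the sketch below, the six-element vector and the variable names are only illustrative:

```python
import numpy as np

v = np.arange(6)                 # six elements: 0, 1, 2, 3, 4, 5

print(v.reshape(1, -1).shape)    # (1, 6)  one row, columns inferred
print(v.reshape(-1, 1).shape)    # (6, 1)  one column, rows inferred
print(v.reshape(2, -1).shape)    # (2, 3)  two rows, so 6 / 2 = 3 columns

# The inferred size must be a whole number: 6 elements cannot fill 4 rows
error_raised = False
try:
    v.reshape(4, -1)
except ValueError as err:
    error_raised = True
    print('reshape failed:', err)
```

Numpy simply divides the total number of elements by the sizes you fixed, and raises an error when the division does not work out.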

Matrix/Vector Operations – Linear Algebra
This section is dedicated to what we mostly use in Machine Learning: operations on vectors and matrices. Let’s take a look:
    # Import Numpy library
    import numpy as np

    # Create two vectors and two matrices
    v = np.array([0,8]).reshape(-1,1)
    u = np.array([1,4]).reshape(-1,1)
    A = np.array([[2,1],[5,2]])
    B = np.array([[2,1],[5,2]])

    # Product of two vectors with two approaches
    # (v times the transpose of u; check the shape of the result)
    print('v.u = ', v.dot(u.transpose()))
    print('v.u = ', np.dot(v, u.transpose()))

    # Product of a vector with a matrix
    print('A.v = ', A.dot(v))
    print('A.v = ', np.dot(A, v))

    # Matrix product with three approaches
    print('A.B = ', A.dot(B))
    print('A.B = ', np.dot(A, B))
    print('A.B = ', np.matmul(A,B))
Let’s practice. Run the above code and answer the following questions:
- What is the shape and rank of
and
?
- In the two lines that compute ‘v.u’, did we have to use “.transpose()”? Why?
- Instead of calculating ‘v.u’ how would you calculate ‘u.v’?
- Take a look at the two lines that compute ‘A.v’. Instead of ‘A.v’, can we calculate ‘v.A’?
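Once you have thought about the questions, one way to check your reasoning is to compare the shapes produced by placing the transpose on either side. This is only a sketch using the same v and u as above; the names outer and inner are mine:

```python
import numpy as np

v = np.array([0, 8]).reshape(-1, 1)   # shape (2, 1)
u = np.array([1, 4]).reshape(-1, 1)   # shape (2, 1)

# (2,1) times (1,2) gives a 2x2 matrix
outer = v.dot(u.transpose())
print(outer.shape, outer.tolist())    # (2, 2) [[0, 0], [8, 32]]

# (1,2) times (2,1) gives a single number wrapped in a 1x1 matrix
inner = v.transpose().dot(u)
print(inner.shape, inner.tolist())    # (1, 1) [[32]]
```

The inner dimensions must match for the product to be defined, which is why the position of the transpose matters.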

I have used np.matmul in one of the previous posts. Now, let’s discuss the operations that we frequently use in Machine Learning: the sum and mean over a whole matrix, or along a specific dimension:
    # Import Numpy library
    import numpy as np

    # Create a matrix
    A = np.array([[2,1,3,4],[5,2,9,4]])
    print('A=', A)

    # Sum and mean over the matrix
    print('sum(A) = ', np.sum(A))
    print('mean(A) ', np.mean(A))

    # Sum and mean over axis zero (rows)
    print('Sum over rows = ', np.sum(A, axis=0))
    print('Mean over rows = ', np.mean(A, axis=0))

    # Sum and mean over axis one (columns)
    print('Sum over columns = ', np.sum(A, axis=1))
    print('Mean over columns = ', np.mean(A, axis=1))

Output:

    A= [[2 1 3 4]
     [5 2 9 4]]
    sum(A) = 30
    mean(A) 3.75
    Sum over rows = [ 7  3 12  8]
    Mean over rows = [3.5 1.5 6.  4. ]
    Sum over columns = [10 20]
    Mean over columns = [2.5 5. ]
NOTE: When we take the sum/mean over a specific axis, the result is an array in which that dimension is collapsed. The example above shows that if we take the sum/mean over dimension zero (one), we get one value per column (row), so the number of elements in the result is equal to the number of columns (rows) in the main matrix A.
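By default the collapsed dimension disappears, so the result is a rank-1 array. If you want to keep it as a dimension of size one, np.sum and np.mean accept the keepdims argument. A small sketch with the same matrix A:

```python
import numpy as np

A = np.array([[2, 1, 3, 4], [5, 2, 9, 4]])

# Default: the summed axis is removed from the shape
print(np.sum(A, axis=0).shape)                     # (4,)

# keepdims=True keeps the summed axis with size one
print(np.sum(A, axis=0, keepdims=True).shape)      # (1, 4)
print(np.sum(A, axis=1, keepdims=True).tolist())   # [[10], [20]]
```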
Array Manipulation
Now that you are familiar with Numpy arrays, it’s time to learn how to play with them.
Indexing
First, let’s define and slice an array.
    # Import Numpy library
    import numpy as np

    # Create a matrix
    A = np.array([[2,1,3,4],[5,2,9,4],[5,2,10,1],[2,2,11,-1]])
    print('A=\n', A)

    # Extract the first two rows
    # Remember 0:2 in indexing means {0,1} and does NOT include 2!
    # Using : merely, means ALL!
    print('The first two rows= \n', A[0:2,:])

    # Extract the first three rows and the last two columns
    # -2: means the second to the last to the end!
    # 0:3 mean {0,1,2}
    print('The first three rows and last two columns= \n', A[0:3,-2:])

    # Let's point to one element
    # The second row (index 1) and third column (index 2)
    # Remember Python indexing starts from zero!!
    print('The second row and third column= \n', A[1,2])

Output:

    A=
     [[ 2  1  3  4]
     [ 5  2  9  4]
     [ 5  2 10  1]
     [ 2  2 11 -1]]
    The first two rows=
     [[2 1 3 4]
     [5 2 9 4]]
    The first three rows and last two columns=
     [[ 3  4]
     [ 9  4]
     [10  1]]
    The second row and third column=
     9

Let’s review the code above. To extract the first two rows, I used slice indexing by selecting a range of indices. To point to one element, I only used integer indices. We can simply combine both, but the rank of the output may differ. Check the example below:
    # Import Numpy library
    import numpy as np

    # Create a matrix
    A = np.array([[2,1,3,4],[5,2,9,4],[5,2,10,1],[2,2,11,-1]])
    print('A=\n', A)

    # Extract the first row and all columns using two approaches
    print('The first row with slice indexing= \n', A[0:1,:]) # Slice indexing
    print('The first row with integer indexing= \n', A[0,:]) # Integer indexing
    print('The first row shape slice indexing= \n', A[0:1,:].shape) # Slice indexing
    print('The first row shape integer indexing= \n', A[0,:].shape) # Integer indexing

Output:

    A=
     [[ 2  1  3  4]
     [ 5  2  9  4]
     [ 5  2 10  1]
     [ 2  2 11 -1]]
    The first row with slice indexing=
     [[2 1 3 4]]
    The first row with integer indexing=
     [2 1 3 4]
    The first row shape slice indexing=
     (1, 4)
    The first row shape integer indexing=
     (4,)
As the results show, the shapes are different. Basically, with slice indexing we get a rank-2 array, and with integer indexing we get a rank-1 array. Be careful about this difference when you are dealing with Numpy indexing.
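The same difference appears when selecting a column. As a hedged sketch with the same matrix A (the names col_int and col_slice are only illustrative), the snippet below also shows that reshape can bring a rank-1 result back to rank 2 when a column vector is needed:

```python
import numpy as np

A = np.array([[2, 1, 3, 4], [5, 2, 9, 4], [5, 2, 10, 1], [2, 2, 11, -1]])

col_int = A[:, 2]       # integer index on the column axis -> rank 1
col_slice = A[:, 2:3]   # slice on the column axis -> rank 2

print(col_int.shape, col_slice.shape)                # (4,) (4, 1)

# Same values, different rank
print(np.array_equal(col_int, col_slice.ravel()))    # True

# reshape turns the rank-1 result into a column vector again
print(col_int.reshape(-1, 1).shape)                  # (4, 1)
```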
Shaping
    # Import Numpy library
    import numpy as np

    # Create a matrix
    A = np.array([[2,1,3,4],[5,2,9,4],[5,2,10,1]])
    print('A=\n', A)
    print('Shape of A=\n', A.shape)

    # Reshape A to the new shape of (2,6)
    B = A.reshape(2,6)
    print("B: \n", B)

    # Reshape A to the new shape of (4,x)
    # If we use -1, the remaining dimension will be chosen automatically.
    C = A.reshape(4,-1)
    print("C: \n", C)

    # Flatten operation
    print("Flatten A: \n", A.ravel())

Output:

    A=
     [[ 2  1  3  4]
     [ 5  2  9  4]
     [ 5  2 10  1]]
    Shape of A=
     (3, 4)
    B:
     [[ 2  1  3  4  5  2]
     [ 9  4  5  2 10  1]]
    C:
     [[ 2  1  3]
     [ 4  5  2]
     [ 9  4  5]
     [ 2 10  1]]
    Flatten A:
     [ 2  1  3  4  5  2  9  4  5  2 10  1]
The question is: how do reshaping operations work? Above we had the matrix A of size 3 × 4 with 12 (3 × 4) total elements. When we use np.reshape, the default Numpy order is “C-style”, that is, the rightmost index “changes the fastest” as the elements are read. Let’s use the above example of .ravel() to flatten the matrix: the first element is obviously A[0,0] and the next one is A[0,1]. When using .ravel(), the elements are read and placed in the new array in the order A[0,0], A[0,1], A[0,2], A[0,3], A[1,0], A[1,1], and so on, up to A[2,3].
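To make the “rightmost index changes the fastest” rule concrete, the sketch below rebuilds the flattened array with two explicit loops and compares it with .ravel(). The loop is only an illustration of the ordering, not how Numpy implements it; the order='F' call is included purely for contrast.

```python
import numpy as np

A = np.array([[2, 1, 3, 4], [5, 2, 9, 4], [5, 2, 10, 1]])

# C-style order: the column index j (rightmost) runs fastest,
# the row index i only advances after a full row has been read
manual = [int(A[i, j]) for i in range(A.shape[0]) for j in range(A.shape[1])]
print(manual == A.ravel().tolist())   # True

# For contrast, order='F' lets the leftmost index change the fastest
print(A.ravel(order='F').tolist())
# [2, 5, 5, 1, 2, 2, 3, 9, 10, 4, 4, 1]
```

Reshaping with the default order reads the elements in this same sequence and then fills the new shape row by row.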
Conclusion
In this tutorial, you learned how to use Numpy. You also saw how important it is for Machine Learning purposes. But clearly this tutorial is NOT a panacea (a cure for everything!) for all your Numpy needs, nor is it flawless! You always need to explore more. You can learn more efficiently if you practice on your own. Use the above code as a starting point and try to play around with it. Feel free to ask questions and share your comments, as I am sure it can help you, me, and every other reader to learn more. This tutorial is subject to change and I would be happy to have your suggestions for doing so.