In this article, I will explain vector and matrix norms. You will see why they are essential and how to interpret them. Furthermore, you will learn how to implement them. I start with the definitions and end with the matrix norms.
“What is the vector norm, and why on earth do I need to know that?” Let’s assume you start by asking that question, seeing the glass as half-empty instead of half-full. So I am here to convince you. If this introduction does not convince you, then simply do not proceed, as it would be a waste of time!
In Machine Learning, we are dealing with evaluations all the time. Needless to say, this usually involves Linear Algebra, vectors, and matrices. When evaluating something, such as a loss function, you often need to summarize everything in one number. For scalar data, you use the mean, standard deviation, etc. BUT, how do you deal with vectors? How would you address the importance of a vector in terms of its elements? How would you evaluate the vector elements in terms of their contributions to the outcome? We need some metric that gives a sense of the magnitude of a vector, to see how much it affects the algorithm optimization, evaluation performance, etc. This is where norms come into play.
By the end of this article, you should know:
- What is the definition of the norm?
- The norms’ properties
- Mostly used vector norms
- Definition of the matrix norm
- How to implement them in Python?
Before You Move On
You may find the following resources helpful to better understand the concept of this article:
- Python Tutorials – A FREE Video Course: You will become familiar with Python and its syntax.
- Numpy – An Introduction to a Great Package for Linear Algebra: Numpy is one of the best scientific computing packages for Linear Algebra! You always need it for Machine Learning as you always need Linear Algebra for Machine Learning!
- The Remarkable Importance of Linear Algebra in Machine Learning: This article talks about why you should care about Linear Algebra if you want to master Machine Learning.
- Basic Linear Algebra Definitions that You Hear Every Day: Covers the primary and most frequently used Linear Algebra definitions in Machine Learning.
What is a Vector Norm?
We usually use norms for vectors and rarely for matrices. First, let’s define what the norm of a vector is. A norm can be described as below:
- A function that operates on a vector (matrix) and returns a scalar element.
- A norm is denoted by $\|v\|_q$, in which $q$ shows the order of the norm and $q \geq 1$.
- The norm of a vector (matrix) is always a non-negative value.
- The intuition behind the norm is to measure a kind of distance.
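A norm is mathematically defined as below:

$$\|v\|_q = \left( \sum_{i=1}^{n} |v_i|^q \right)^{1/q}$$

Here, $v_i$ is the $i$-th of the $n$ elements of the vector $v$. The sign $|\cdot|$ is an operation that outputs the absolute value of its argument. Examples are $|-2| = 2$ and $|2| = 2$. You can implement the norm with the following Python code: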
```python
# Import the NumPy package and the norm function
import numpy as np
from numpy.linalg import norm

# Define a vector
v = np.array([2, 3, 1, 0])

# Take the q-norm with q=2
q = 2
v_norm = norm(v, ord=q)

# Print values
print('The vector: ', v)
print('The vector norm: ', v_norm)
```
You should get the following output:
```
The vector:  [2 3 1 0]
The vector norm:  3.7416573867739413
```
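To connect the library call with the definition, the q-norm can also be coded by hand. The following is only a minimal sketch: the helper name `q_norm` is my own (it is not a NumPy function), and it assumes a 1-D vector and an order of at least 1.

```python
import numpy as np
from numpy.linalg import norm

def q_norm(v, q):
    """Hypothetical helper: q-norm of a 1-D vector, straight from the definition."""
    return np.sum(np.abs(v) ** q) ** (1.0 / q)

v = np.array([2, 3, 1, 0])

# Compare the hand-written version with NumPy for a few orders
for q in (1, 2, 3):
    manual = q_norm(v, q)
    library = norm(v, ord=q)
    print('q =', q, '| manual:', manual, '| numpy:', library)
```

For every order, the two printed values should agree up to floating-point precision.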
The Norm Function Properties
It is crucial to know the properties of norms, as we may need them in mathematical computations, especially in proofs. A norm function has the following properties:
- If the norm is zero, then the vector is all zero as well: $\|v\|_q = 0 \Rightarrow v = 0$
- $\|v + w\|_q \leq \|v\|_q + \|w\|_q$. This is called the triangle inequality.
- $\|\alpha v\|_q = |\alpha| \, \|v\|_q$ for any scalar $\alpha$.
Proving the Properties (Advanced)
Property (1) is easy to prove. As the norm adds up the absolute values of the vector elements raised to a power, every term in the sum is non-negative. So if the norm is zero, the only possibility is that all vector elements are zero.
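We can prove property (3), which says that scaling a vector by a scalar $\alpha$ scales its norm by $|\alpha|$, as below:

$$\|\alpha v\|_q = \left( \sum_{i=1}^{n} |\alpha v_i|^q \right)^{1/q} = \left( |\alpha|^q \sum_{i=1}^{n} |v_i|^q \right)^{1/q} = |\alpha| \, \|v\|_q$$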
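A quick numerical sanity check can complement the proof. This sketch assumes property (3) is the usual scaling property, i.e., the norm of $\alpha v$ equals $|\alpha|$ times the norm of $v$. The scalar `alpha` and the vector values are arbitrary choices for illustration.

```python
import numpy as np
from numpy.linalg import norm

q = 2                                # order of the norm
alpha = -3.5                         # arbitrary scalar, chosen for illustration
v = np.array([2.0, 3.0, 1.0, 0.0])   # arbitrary vector

left = norm(alpha * v, ord=q)            # norm of the scaled vector
right = np.abs(alpha) * norm(v, ord=q)   # |alpha| times the norm of v

print('Both sides agree:', np.isclose(left, right))
```

Try a few other values of `alpha` and `q` to convince yourself that the agreement is not a coincidence.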
We can mathematically prove property (2) as well. BUT, let’s do it in an empirical way. Let’s define our experiment as below:
- We randomly generate two vectors $v$ and $w$.
- We check the property $\|v + w\|_q \leq \|v\|_q + \|w\|_q$, which should be true.
- We repeat the experiment E=100 times.
This is NOT a scientific proof. However, if $E \to \infty$, then we can say we proved it. Right?
Let’s do it in Python:
```python
# Import the NumPy package and the norm function
import numpy as np
from numpy.linalg import norm

# Repeat experiments
E = 100  # Number of experiments
q = 2    # Order of the norm

for i in range(E):
    # Define two random vectors with 5 elements each. Obviously v does not equal w!!
    v = np.random.rand(5)
    w = np.random.rand(5)

    # Check the triangle inequality for this pair of vectors
    propertyCheck = norm(v + w, ord=q) <= norm(v, ord=q) + norm(w, ord=q)
    if not propertyCheck:
        print('Property is NOT correct')
```
So if property (2) holds for all experiments, we expect the above Python code to print NOTHING. Can you see why?
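The same experiment can be extended to more than one order. The sketch below is a variation of my own, not a replacement for the code above: it uses a seeded random generator so the run is repeatable, tries the orders 1, 2, and infinity, and counts violations instead of printing them. The tolerance `1e-12` is an assumption to guard against floating-point round-off.

```python
import numpy as np
from numpy.linalg import norm

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
E = 100                          # number of experiments per order
violations = 0                   # how many times the inequality failed

for q in (1, 2, np.inf):
    for _ in range(E):
        v = rng.random(5)
        w = rng.random(5)
        # Small tolerance guards against floating-point round-off
        if norm(v + w, ord=q) > norm(v, ord=q) + norm(w, ord=q) + 1e-12:
            violations += 1

print('Violations found:', violations)
```

Counting violations makes the result easy to check in one line, which is handy if you increase `E` or add more orders.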
Mostly Used Norms
In the previous section, I described what a norm is in general, and we implemented it in Python. Here, I would like to discuss the norms that are most commonly used in Machine Learning.
The $L_1$ norm is technically the summation over the absolute values of the elements of a vector. The simple mathematical formulation is as below:

$$\|v\|_1 = \sum_{i=1}^{n} |v_i|$$

In Machine Learning, we usually use the $L_1$ norm when the sparsity of a vector matters, i.e., when the essential factor is the non-zero elements of the vector. BUT why? The $L_1$ norm simply targets the non-zero elements by adding up their absolute values.
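Here is a minimal sketch of the $L_1$ norm in Python, computed once by hand and once with NumPy. The vector values are made up for illustration; I chose a mostly-zero vector to show that only the non-zero elements contribute to the sum.

```python
import numpy as np
from numpy.linalg import norm

# A mostly-zero (sparse) vector, chosen for illustration
v = np.array([0.0, -3.0, 0.0, 4.0, 0.0])

# Approach 1: sum of the absolute values
L1_manual = np.sum(np.abs(v))

# Approach 2: NumPy's norm with ord=1
L1_numpy = norm(v, ord=1)

print('L1 norm (manual):', L1_manual)
print('L1 norm (numpy): ', L1_numpy)
```

Both approaches should print the same value.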
The $L_2$ norm is also called the Euclidean norm, which is the Euclidean distance of a vector from the origin.
We can simply calculate it as below:

$$\|v\|_2 = \sqrt{\sum_{i=1}^{n} v_i^2}$$

The $L_2$ norm, usually in its squared form, is commonly used in Machine Learning due to being differentiable, which is crucial for optimization purposes.
Let’s calculate the $L_2$ norm of a random vector with Python using two approaches. Both should lead to the same results:
```python
# Import the NumPy package and the norm function
import numpy as np
from numpy.linalg import norm

# Define a random vector with 5 elements
v = np.random.rand(5)

# Approach 1: calculate the L2 norm from its definition
sum_square = 0
for i in range(v.shape[0]):
    sum_square += np.square(v[i])
L2_norm_approach_1 = np.sqrt(sum_square)

# Approach 2: calculate the L2 norm using NumPy
L2_norm_approach_2 = norm(v, ord=2)

print('L2_norm: ', L2_norm_approach_1)
print('L2_norm with numpy:', L2_norm_approach_2)
```
You should get the same results!
You may not see the max norm quite often. However, it is a definition that you should be familiar with. The max norm is denoted with $L_\infty$ and the mathematical formulation is as below:

$$\|v\|_\infty = \max_{i} |v_i|$$

It simply returns the maximum absolute value among the vector elements.
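The max norm can be sketched in Python in the same two-approach style. The vector values are arbitrary. Note that the largest absolute value here comes from a negative element, which is why the absolute value must be taken before the maximum.

```python
import numpy as np
from numpy.linalg import norm

# Arbitrary vector; the element with the largest magnitude is negative
v = np.array([2.0, -7.0, 1.0, 0.5])

# Approach 1: maximum of the absolute values
max_norm_manual = np.max(np.abs(v))

# Approach 2: NumPy's norm with ord=np.inf
max_norm_numpy = norm(v, ord=np.inf)

print('Max norm (manual):', max_norm_manual)
print('Max norm (numpy): ', max_norm_numpy)
```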
Norm of a Matrix
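For calculating the norm of a matrix, we have the Frobenius norm, which is very similar to the $L_2$ norm of a vector and is defined as below, where $a_{ij}$ is the element of the matrix $A$ in row $i$ and column $j$:

$$\|A\|_F = \sqrt{\sum_{i} \sum_{j} |a_{ij}|^2}$$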
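As a minimal sketch, the Frobenius norm can be computed by hand and with NumPy. The matrix values are arbitrary. The third approach flattens the matrix into a vector to show the similarity to the vector $L_2$ norm.

```python
import numpy as np
from numpy.linalg import norm

# An arbitrary 2x2 matrix, chosen for illustration
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Approach 1: square root of the sum of squared elements
fro_manual = np.sqrt(np.sum(np.abs(A) ** 2))

# Approach 2: NumPy's norm with ord='fro'
fro_numpy = norm(A, ord='fro')

# Approach 3: L2 norm of the matrix flattened into a vector
fro_flat = norm(A.ravel(), ord=2)

print('Frobenius norm (manual):   ', fro_manual)
print('Frobenius norm (numpy):    ', fro_numpy)
print('L2 norm of flattened matrix:', fro_flat)
```

All three values should agree. Be careful with `ord=2` on a 2-D array: NumPy then returns the largest singular value, which is a different matrix norm.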
In this article, you learned what norms are and how to implement them. I wish I could tell you this is the end and that’s all you need! BUT, no! Learning is an active process. You will see many tutorials and concepts regarding vector norms. Here, I tried to address what I believe you will see most frequently. There is always MORE! You can start with what I just provided. Make sure to play with the Python code. I assure you it will help you gain a better understanding. I hope you find it useful. Please don’t forget to share your thoughts with me.