You may hear about data augmentation everywhere in machine learning. Deep learning is a powerful machine learning approach based on neural networks, which perform well when trained on vast amounts of data. However, it requires a lot of data. What if we don’t have enough data and cannot obtain more? Data augmentation is there to help us: it is a powerful method that artificially increases the amount and diversity of data. In this tutorial, we focus on the following:
- What are conventional data augmentation techniques in the image domain
- How to implement them in TensorFlow
It is worth noting that, here, you only become familiar with the different techniques. Using them to train a neural network needs more tweaking, and we will address that in future tutorials.
How does it work?
Data augmentation is employed to show the model different variations of the data that can be used in training and are consistent with the task of interest. Take image classification as an example, and have a look at the images below:
[Figure: an original image from the colorectal cancer histology dataset alongside a noisy variant of the same image]
The images are extracted from the textures in colorectal cancer histology dataset, where the task is to classify the textures in colorectal cancer histology. Intuitively, you can say the noisy image is still good to use for this specific task, although it is not as clear as the original image! That brings us to one of the most widely used augmentation approaches: adding random noise. Let’s summarize:
- Adding random noise can create new artificial samples that might still be suitable for training a model (if not too noisy!).
- A noisy image is a variant of an image.
- We can simply change the noise power and create different artificial images, which gives us the potential to generate an infinite amount of new data (think about why!). A minimal sketch of this idea follows below.
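To make the idea concrete, here is a minimal sketch in TensorFlow (the uniform image tensor is just a stand-in for a real sample): every choice of noise power, and in fact every random draw, produces a new artificial image.

```python
import tensorflow as tf

# A stand-in for a real image: float values in [0, 1], shape 150 x 150 x 3
image = tf.random.uniform(shape=(150, 150, 3))

# Each standard deviation (and each random draw) yields a new artificial sample
for stddev in [0.05, 0.1, 0.2]:
    noise = tf.random.normal(shape=tf.shape(image), mean=0.0, stddev=stddev)
    noisy_variant = tf.clip_by_value(image + noise, 0.0, 1.0)
```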
Common Data Augmentation Techniques
In this section, I am going to briefly address some of the most common data augmentation techniques utilized in the image domain, using TensorFlow for the implementation.
Dataset
We use the textures in colorectal cancer histology dataset. It contains 5000 images from 8 different classes, and each image is of size 150 x 150 x 3 (RGB). This dataset is part of TensorFlow Datasets, a collection of ready-to-use datasets. First, let’s prepare the environment for TensorFlow:
```python
import tensorflow as tf
import tensorflow_datasets as tfds  # Import TensorFlow datasets
import urllib
import tensorflow_docs.plots
import matplotlib.pyplot as plt
import numpy as np

# Necessary for dealing with https urls
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
```
Not all of the above libraries will be used; I loaded them just to show you the most commonly used libraries when working with TensorFlow. Now we are going to load our dataset with the tfds.load command (read more), using some common arguments:
- split=: Picks the predefined split to read (see the TensorFlow API guide).
- shuffle_files=: Shuffles the files in each epoch if True. This creates different batches for each epoch of training.
- data_dir=: The location where the data is saved (default: ~/tensorflow_datasets/).
- with_info=: Returns the tfds.core.DatasetInfo containing the dataset metadata.
- download=: If True, downloads the dataset. Once we have downloaded it, we set this to False for future calls, as a redownload is not necessary.
Use the code below:
```python
# Load the full training split (set download=True on the first run)
ds, ds_info = tfds.load('colorectal_histology', split='train',
                        shuffle_files=True, with_info=True, download=False)
assert isinstance(ds, tf.data.Dataset)
print(ds_info)
```
You should get the following output, which shows some information about the data:
```
tfds.core.DatasetInfo(
    name='colorectal_histology',
    version=2.0.0,
    description='Classification of textures in colorectal cancer histology. Each example is a 150 x 150 x 3 RGB image of one of 8 classes.',
    homepage='https://zenodo.org/record/53169#.XGZemKwzbmG',
    features=FeaturesDict({
        'filename': Text(shape=(), dtype=tf.string),
        'image': Image(shape=(150, 150, 3), dtype=tf.uint8),
        'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=8),
    }),
    total_num_examples=5000,
    splits={
        'train': 5000,
    },
    supervised_keys=('image', 'label'),
    citation="""@article{kather2016multi,
      title={Multi-class texture analysis in colorectal cancer histology},
      author={Kather, Jakob Nikolas and Weis, Cleo-Aron and Bianconi, Francesco and Melchers, Susanne M and Schad, Lothar R and Gaiser, Timo and Marx, Alexander and Z{\"o}llner, Frank Gerrit},
      journal={Scientific reports},
      volume={6},
      pages={27988},
      year={2016},
      publisher={Nature Publishing Group}
    }""",
    redistribution_info=,
)
```
Now let’s visualize some examples with the built-in tfds.show_examples method:
```python
# Visualizing images
fig = tfds.show_examples(ds_info, ds)
```
[Figure: a grid of example images from the dataset with their class labels, produced by tfds.show_examples]
Our dataset is stored in the ds object. Let’s play around with it and extract one example image and its associated label:
```python
# Reading all images (remove the break to read all of them)
for example in tfds.as_numpy(ds):
    image, label = example['image'], example['label']
    break

# Take one sample from the data
one_sample = ds.take(1)
one_sample = list(one_sample.as_numpy_iterator())
image = one_sample[0]['image']
label = one_sample[0]['label']
print(image.shape, label.shape)
```
Perhaps the above code needs more explanation:
- Lines 2-4: tfds.as_numpy(ds) turns the data into an iterable of NumPy arrays. Each example has its associated image and label. The break statement is there only to prevent going through the whole data, as we do not need it. You can comment it out, though!
- Line 7: With ds.take(1), we only take one sample from the data.
- Line 8: list(one_sample.as_numpy_iterator()) transforms the extracted sample from the previous step into a list of examples.
- Lines 9-10: Extract the first element of the list (since we took only one sample, the list has only one element), and extract the associated image and label.
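As a side note, because the dataset info above defines supervised_keys=('image', 'label'), tfds.load can also return the data directly as (image, label) tuples through its as_supervised argument. A small sketch, not required for the rest of the tutorial:

```python
# Load the same dataset as (image, label) tuples instead of dictionaries
ds_supervised = tfds.load('colorectal_histology', split='train', as_supervised=True)
image_sup, label_sup = next(iter(tfds.as_numpy(ds_supervised.take(1))))
print(image_sup.shape, label_sup)  # (150, 150, 3) and an integer class label
```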
Let’s define a helper function that visualizes the original and augmented images side by side:
```python
# Side by side visualization
def visualize(im, imAgmented, operation):
    fig = plt.figure()
    plt.subplot(1, 2, 1)
    plt.title('Original image')
    plt.imshow(im)
    plt.subplot(1, 2, 2)
    plt.title(operation)
    plt.imshow(imAgmented)
```
Now we are ready to investigate the augmentation techniques. See the visualized examples and think about why those image variants are good candidates.
Data Augmentation Techniques
Additive Noise
Adding noise is one of the most common data augmentation techniques, as justified above. Here, we directly investigate the implementation:
```python
# Adding Gaussian noise to the image
common_type = tf.float32  # Make the noise and the image the same type
gnoise = tf.random.normal(shape=tf.shape(image), mean=0.0, stddev=0.1,
                          dtype=common_type)
image_type_converted = tf.image.convert_image_dtype(image, dtype=common_type,
                                                    saturate=False)
noisy_image = tf.add(image_type_converted, gnoise)
visualize(image_type_converted, noisy_image, 'noisy image')
```
[Figure: the original image next to its noisy version]
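One caveat worth mentioning: adding noise can push pixel values outside the valid [0, 1] range of a float image. A minimal fix with the standard tf.clip_by_value op:

```python
# Clamp pixel values back into the valid [0, 1] range after adding noise
noisy_image = tf.clip_by_value(noisy_image, clip_value_min=0.0, clip_value_max=1.0)
```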
Brightness Adjustment
The next technique is to adjust the image brightness. By changing the brightness, we get a new image that maintains the majority of the original image’s characteristics.
```python
# Adjusting brightness
bright = tf.image.adjust_brightness(image, 0.2)
visualize(image, bright, 'brightened image')
```
[Figure: the original image next to the brightened image]
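In a training loop, you usually want the adjustment to differ per example. TensorFlow offers tf.image.random_brightness for this, which draws the delta uniformly from [-max_delta, max_delta]:

```python
# Randomly adjust brightness with a delta drawn from [-0.2, 0.2]
random_bright = tf.image.random_brightness(image, max_delta=0.2)
visualize(image, random_bright, 'randomly brightened image')
```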
Image Flipping
The next technique is to flip images from left to right!
```python
# Flip image
flipped = tf.image.flip_left_right(image)
visualize(image, flipped, 'flipped image')
```
[Figure: the original image next to the flipped image]
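As with brightness, the random variant is what you typically use during training: tf.image.random_flip_left_right flips the image with a 50% chance (tf.image.random_flip_up_down does the same vertically):

```python
# Flip the image left-to-right with a 50% probability
random_flipped = tf.image.random_flip_left_right(image)
visualize(image, random_flipped, 'randomly flipped image')
```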
Quality Adjustment
One of the most common data augmentation techniques is to adjust the quality of an image. However, there is a caveat here: if we reduce the quality too much, the resulting image may no longer be a desirable candidate for training our model!
```python
# Adjust the JPEG encoding quality of the image
adjusted = tf.image.adjust_jpeg_quality(image, jpeg_quality=20)
visualize(image, adjusted, 'quality adjusted image')
```
[Figure: the original image next to the quality-adjusted image]
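To keep the degradation from being identical for every sample, tf.image.random_jpeg_quality draws the quality uniformly from a given range:

```python
# Pick a JPEG quality uniformly at random from [20, 80]
random_adjusted = tf.image.random_jpeg_quality(image, min_jpeg_quality=20,
                                               max_jpeg_quality=80)
visualize(image, random_adjusted, 'randomly quality-adjusted image')
```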
Image Cropping
Random cropping of image areas is one of the most common data augmentation techniques. Assume a scenario where we have an image of a cat and the task is classification, and say the cat occupies only a portion of the picture. Then any randomly cropped area of the image that contains the cat is a good candidate for training our model. Why? It’s simple: the most important characteristic of the image is that it has a cat inside it! Who cares about the size?
```python
# Random cropping of the image (the cropping area is picked at random)
crop_to_original_ratio = 0.5  # The scale of the cropped area to the original image
new_size = int(crop_to_original_ratio * image.shape[0])
cropped = tf.image.random_crop(image, size=[new_size, new_size, 3])
visualize(image, cropped, 'randomly cropped image')
```
[Figure: the original image next to the randomly cropped image]
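A practical note: most networks expect a fixed input size, so cropped patches are usually resized back to the original resolution. A minimal sketch with tf.image.resize (which returns a float tensor):

```python
# Resize the cropped patch back to the original 150 x 150 resolution
resized_crop = tf.image.resize(cropped, size=[image.shape[0], image.shape[1]])
```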
Center cropping is another frequently used cropping technique. It is especially useful when we only desire to investigate the center of our images, and a lot of the surrounding area gives redundant information!
```python
# Center cropping of the image (the cropping area is at the center)
central_fraction = 0.6  # The scale of the cropped area to the original image
center_cropped = tf.image.central_crop(image, central_fraction=central_fraction)
visualize(image, center_cropped, 'centrally cropped image')
```
[Figure: the original image next to the centrally cropped image]
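Finally, in a real training setup these operations are usually chained inside a tf.data input pipeline with ds.map, so every epoch sees freshly augmented samples. The sketch below is illustrative; the augment function and its parameter values are my own choices, not part of the code above:

```python
def augment(example):
    # Convert to float in [0, 1], then apply a few random augmentations
    img = tf.image.convert_image_dtype(example['image'], tf.float32)
    img = tf.image.random_flip_left_right(img)
    img = tf.image.random_brightness(img, max_delta=0.2)
    img = img + tf.random.normal(tf.shape(img), mean=0.0, stddev=0.05)
    example['image'] = tf.clip_by_value(img, 0.0, 1.0)
    return example

# Augmentations run on the fly as the dataset is iterated
augmented_ds = ds.map(augment, num_parallel_calls=tf.data.experimental.AUTOTUNE)
```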
Conclusion
In this tutorial, we investigated some of the most common data augmentation techniques in the image domain. The implementations were in TensorFlow, currently one of the most popular deep learning libraries. Although we addressed some of the most commonly used techniques, the story does not end here! These techniques are only applicable in the image domain, and they are known as traditional techniques; there are many newer approaches out there for you to explore. Feel free to comment below if you have any questions.