You will come across data augmentation almost everywhere in machine learning. Deep learning, the family of machine learning methods built on neural networks, works well when trained on vast amounts of data. The catch is that it needs that data. What if we don’t have enough and cannot obtain more? Data augmentation helps here: it artificially increases the amount and diversity of the training data. In this tutorial, we focus on the following:

  • What are conventional data augmentation techniques in the image domain
  • How to implement them in TensorFlow

Note that this tutorial only introduces the techniques themselves. Using them while training a neural network needs more tweaking, which we address in future tutorials.

How does it work?

Data augmentation shows the model different variations of the data that can be used in training and are still consistent with the task of interest. Take image classification as an example and look at the images below:

[Figure: an original texture image and a noisy version of it]

The images come from the textures in colorectal cancer histology dataset, where the task is to classify the textures. Intuitively, the noisy image is still usable for this task, although it is not as clear as the original. That brings us to one of the most widely used augmentation approaches: adding random noise. Let’s summarize:

  • Adding random noise creates new artificial samples that may still be suitable for training a model (if they are not too noisy!).
  • A noisy image is a variant of the original image.
  • By changing the noise power, we can create different artificial images, so in principle we can generate an unlimited amount of new data (think about why!).
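
To make the last point concrete, here is a minimal NumPy sketch (not the TensorFlow code used later in this tutorial). The add_gaussian_noise helper, the noise levels, and the random stand-in image are all assumptions made for illustration. It shows that every combination of noise level and random draw gives a different variant of the same image:

```python
import numpy as np

def add_gaussian_noise(image, stddev, rng):
    """Return a noisy copy of a float image in [0, 1] (hypothetical helper)."""
    noise = rng.normal(loc=0.0, scale=stddev, size=image.shape)
    # Clip so the result stays a valid image.
    return np.clip(image + noise, 0.0, 1.0)

rng = np.random.default_rng(seed=0)
# Stand-in for a real 150 x 150 RGB image scaled to [0, 1].
image = rng.random((150, 150, 3))

# Illustrative noise levels; any positive value yields yet another variant.
noise_levels = (0.02, 0.05, 0.1)
variants = [add_gaussian_noise(image, s, rng) for s in noise_levels]
for s, v in zip(noise_levels, variants):
    print(f"stddev={s}: mean absolute change = {np.abs(v - image).mean():.4f}")
```

Since the noise level is a continuous value and each draw is random, there is no practical limit on the number of distinct variants. Whether a given variant is still useful for training is a separate question.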


Common Data Augmentation Techniques

In this section, I briefly cover some of the most common data augmentation techniques used in the image domain, implemented in TensorFlow.

Dataset

We use the textures in colorectal cancer histology dataset. It contains 5000 RGB images of size 150 x 150 x 3 from 8 different classes. The dataset is part of TensorFlow Datasets, a collection of ready-to-use datasets. First, let’s prepare the environment for TensorFlow:

import tensorflow as tf
import tensorflow_datasets as tfds # Import TensorFlow datasets
import urllib
import tensorflow_docs.plots
import matplotlib.pyplot as plt
import numpy as np

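# Workaround for SSL certificate errors when downloading over https.
# Note: this disables certificate verification, so use it only if you need it.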
import ssl
ssl._create_default_https_context = ssl._create_unverified_context

Not all of the libraries loaded above will be used; I imported them to show the libraries most commonly used when working with TensorFlow. Now we load our dataset with the tfds.load command, using some common arguments:

  • split=: Pick the predefined split to read (see TensorFlow API guide).
  • shuffle_files=: If True, shuffle the input files, which changes the order in which examples are read.
  • data_dir=: Location where the data is saved (default: ~/tensorflow_datasets/).
  • with_info=: If True, also return the tfds.core.DatasetInfo object containing dataset metadata.
  • download=: If True, download the dataset unless it is already in data_dir (in that case nothing is downloaded again). If False, the data is expected to be in data_dir already.

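Use the code below:

# Load the full training split (use split='train[:10]' to read only the first 10 samples).
# download=False assumes the data is already in data_dir; use download=True on the first run.
ds, ds_info = tfds.load('colorectal_histology', split='train', shuffle_files=True, with_info=True, download=False)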
assert isinstance(ds, tf.data.Dataset)
print(ds_info)

You should get the following output, which shows some information about the data:

tfds.core.DatasetInfo(
    name='colorectal_histology',
    version=2.0.0,
    description='Classification of textures in colorectal cancer histology. Each example is a 150 x 150 x 3 RGB image of one of 8 classes.',
    homepage='https://zenodo.org/record/53169#.XGZemKwzbmG',
    features=FeaturesDict({
        'filename': Text(shape=(), dtype=tf.string),
        'image': Image(shape=(150, 150, 3), dtype=tf.uint8),
        'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=8),
    }),
    total_num_examples=5000,
    splits={
        'train': 5000,
    },
    supervised_keys=('image', 'label'),
    citation="""@article{kather2016multi,
      title={Multi-class texture analysis in colorectal cancer histology},
      author={Kather, Jakob Nikolas and Weis, Cleo-Aron and Bianconi, Francesco and Melchers, Susanne M and Schad, Lothar R and Gaiser, Timo and Marx, Alexander and Z{"o}llner, Frank Gerrit},
      journal={Scientific reports},
      volume={6},
      pages={27988},
      year={2016},
      publisher={Nature Publishing Group}
    }""",
    redistribution_info=,
)

Now let’s visualize some examples with the built-in tfds.show_examples function:

# Visualizing images
# Older TFDS releases used the reverse argument order: tfds.show_examples(ds_info, ds)
fig = tfds.show_examples(ds, ds_info)

Our dataset is stored in the ds object. Let’s play around with it and extract one example image and its associated label:

# Iterate over the images (remove the break to read all of them)
for example in tfds.as_numpy(ds):
  image, label = example['image'], example['label']
  break

# take one sample from data
one_sample = ds.take(1)
one_sample = list(one_sample.as_numpy_iterator())
image = one_sample[0]['image']
label = one_sample[0]['label']
print(image.shape,label.shape)

The code above may need more explanation:

  • Lines 2-4: tfds.as_numpy(ds) turns the dataset into an iterable of NumPy arrays. Each example has its associated image and label. The break is there only to avoid going through the whole dataset, which we do not need. You can comment it out, though!
  • Line 7: With ds.take(1), we take only one sample from the data.
  • Line 8: list(one_sample.as_numpy_iterator()) transforms the samples extracted in the previous step into a list of examples.
  • Lines 9-10: Take the first element of the list (since we took only one sample, the list has only one element) and extract the associated image and label.

Let’s define a helper function that visualizes the original and augmented images side by side:

# Side by side visualization
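def visualize(im, im_augmented, operation):
  plt.figure()
  # Left panel: the original image
  plt.subplot(1, 2, 1)
  plt.title('Original image')
  plt.imshow(im)

  # Right panel: the augmented image, titled with the operation name
  plt.subplot(1, 2, 2)
  plt.title(operation)
  plt.imshow(im_augmented)
  plt.show()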

Now we are ready to investigate the augmentation techniques. Look at the visualized examples and think about why those image variants are good candidates.

Data Augmentation Techniques

Additive Noise

Adding noise, as justified above, is one of the most common data augmentation techniques. Here, we go straight to the implementation:

# Adding Gaussian noise to image
common_type = tf.float32 # Make noise and image of the same type
gnoise = tf.random.normal(shape=tf.shape(image), mean=0.0, stddev=0.1, dtype=common_type)
# Convert the uint8 image to float32, which also rescales it to [0, 1]
image_type_converted = tf.image.convert_image_dtype(image, dtype=common_type, saturate=False)
noisy_image = tf.add(image_type_converted, gnoise)
# Clip back to [0, 1] so the result remains a valid float image
noisy_image = tf.clip_by_value(noisy_image, 0.0, 1.0)
visualize(image_type_converted, noisy_image, 'noisy image')

Brightness Adjustment

The next technique is to adjust the image brightness. Changing the brightness gives a new image that keeps most of the characteristics of the original. Here, the second argument (0.2) is the amount added to the pixel values, on a scale where the image spans [0, 1].

# Adjusting brightness
bright = tf.image.adjust_brightness(image, 0.2)
visualize(image, bright, 'brightened image')
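
To see what adding 0.2 means for an 8-bit image, here is a minimal NumPy sketch of the same idea: rescale to [0, 1], add the delta, saturate, and convert back. The adjust_brightness_uint8 helper is hypothetical and is not TensorFlow's implementation; results may differ from tf.image.adjust_brightness by one intensity level because of rounding.

```python
import numpy as np

def adjust_brightness_uint8(image, delta):
    """Add `delta` (in [0, 1] units) to a uint8 image, saturating at 0 and 255.

    Hypothetical helper for illustration only.
    """
    as_float = image.astype(np.float32) / 255.0
    shifted = np.clip(as_float + delta, 0.0, 1.0)
    return np.round(shifted * 255.0).astype(np.uint8)

# One RGB pixel with a dark, a medium, and an almost saturated channel.
pixels = np.array([[[0, 100, 250]]], dtype=np.uint8)
brighter = adjust_brightness_uint8(pixels, 0.2)
print(brighter)  # [[[ 51 151 255]]]
```

A delta of 0.2 shifts each channel by about 51 levels out of 255. Channels that are already bright saturate at 255, which is one reason to keep the delta moderate.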

Image Flipping

The next technique is to flip images from left to right!

# Flip image
flipped = tf.image.flip_left_right(image)
visualize(image, flipped, 'flipped image')
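
Flipping is just a reversal of one image axis. The NumPy sketch below uses a made-up 2 x 3 single-channel array (an assumption for readability, not data from the dataset) to show a left-right flip and, for comparison, an up-down flip, which TensorFlow offers as tf.image.flip_up_down:

```python
import numpy as np

# Tiny 2 x 3 single-channel "image" so the effect is easy to read.
tiny = np.arange(6).reshape(2, 3, 1)

flipped_lr = tiny[:, ::-1, :]  # reverse the width axis (left-right flip)
flipped_ud = tiny[::-1, :, :]  # reverse the height axis (up-down flip)

print(tiny[..., 0])        # [[0 1 2], [3 4 5]]
print(flipped_lr[..., 0])  # [[2 1 0], [5 4 3]]
print(flipped_ud[..., 0])  # [[3 4 5], [0 1 2]]
```

Flipping twice along the same axis returns the original image, so each axis contributes exactly one new variant per image.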

Quality Adjustment

Another common data augmentation technique is to adjust the quality of an image, here by JPEG-compressing it at a chosen quality level. There is a caveat, however: if we reduce the quality too much, the resulting image may no longer be a desirable candidate for training our model!

# Re-encode the image as JPEG at quality 20 (on a 0-100 scale) and decode it again
adjusted = tf.image.adjust_jpeg_quality(image, jpeg_quality=20)
visualize(image, adjusted, 'quality adjusted image')

Image Cropping

Randomly cropping image areas is one of the most common data augmentation techniques. Assume we have an image of a cat, and the task is classification. Let’s say the cat occupies only a portion of the picture. Then any randomly cropped area of the image that contains the cat is a good candidate for training our model. Why? It’s simple. The most important characteristic of the image is that it has a cat in it! Who cares about the size?

# Random cropping of the image (the cropping area is picked at random)
crop_to_original_ratio = 0.5 # The scale of the cropped area to the original image
new_size = int(crop_to_original_ratio * image.shape[0])
cropped = tf.image.random_crop(image, size=[new_size,new_size,3])
visualize(image, cropped, 'randomly cropped image')
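
Under the hood, a random crop only needs a random top-left corner that keeps the window inside the image. Here is a minimal NumPy sketch of that idea; the random_crop helper and the random stand-in image are assumptions for illustration, not TensorFlow's implementation:

```python
import numpy as np

def random_crop(image, crop_height, crop_width, rng):
    """Cut a randomly placed window out of an H x W x C image (hypothetical helper)."""
    height, width = image.shape[:2]
    # Pick the top-left corner so the whole window fits inside the image.
    top = int(rng.integers(0, height - crop_height + 1))
    left = int(rng.integers(0, width - crop_width + 1))
    return image[top:top + crop_height, left:left + crop_width]

rng = np.random.default_rng(seed=0)
# Stand-in for a real 150 x 150 RGB image.
image = rng.integers(0, 256, size=(150, 150, 3), dtype=np.uint8)

# Each call can land on a different window, so one image yields many crops.
crops = [random_crop(image, 75, 75, rng) for _ in range(4)]
print([c.shape for c in crops])
```

Note that the crops are smaller than the original image. If the model expects a fixed input size, the crops usually have to be resized before training.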

Center cropping is another frequently used cropping technique. It is especially useful when we only want to look at the center of our images because many of the outer areas carry redundant information!

# Center cropping of the image (the cropping area is at the center)
central_fraction = 0.6 # The scale of the cropped area to the original image
center_cropped = tf.image.central_crop(image, central_fraction=central_fraction)
visualize(image, center_cropped, 'centrally cropped image')
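
The size of the central crop follows from simple index arithmetic. The NumPy sketch below is a hypothetical central_crop helper that, to my understanding, follows the same arithmetic as tf.image.central_crop (trim an equal margin from each side); treat it as an illustration, not as the library code:

```python
import numpy as np

def central_crop(image, fraction):
    """Keep the central `fraction` of an H x W x C image (hypothetical helper)."""
    height, width = image.shape[:2]
    # Margin to trim from each side along each axis.
    top = int((height - height * fraction) / 2)
    left = int((width - width * fraction) / 2)
    return image[top:height - top, left:width - left]

# Stand-in for a real 150 x 150 RGB image.
image = np.zeros((150, 150, 3), dtype=np.uint8)
center = central_crop(image, 0.6)
print(center.shape)  # (90, 90, 3)
```

With a fraction of 0.6, a 150 x 150 image loses a 30-pixel margin on every side and becomes 90 x 90.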

Conclusion

In this tutorial, we investigated some of the most common data augmentation techniques in the image domain. The implementations were in TensorFlow, one of the most popular deep learning libraries as of now. Although we covered some of the most commonly used techniques, the story does not end here! The techniques shown here apply to the image domain only. Furthermore, they are known as traditional techniques, and there are many newer approaches out there for you to explore. Feel free to comment below if you have any questions.
