Image data augmentation

pytorch
data augmentation
Author

Youfeng Zhou

Published

November 10, 2022

This notebook shows common torchvision transforms for preparing and augmenting image data for vision models in PyTorch.

Load an original image

from PIL import Image
import torchvision

img = Image.open('images/cat.jpeg')
img

Convert the image to a tensor

toTensor = torchvision.transforms.ToTensor()
toTensor(img).shape
torch.Size([3, 1199, 1200])

The tensor shape is (channels, height, width), so the image has 3 color channels and is 1199 pixels high and 1200 pixels wide.

Resize the image

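Here we resize the image to 224 \(\times\) 224. Passing a single integer to Resize scales the shorter edge to 224 and keeps the aspect ratio; because this image is almost square, the result is 224 \(\times\) 224. For an image of arbitrary shape, pass (224, 224) to force that exact size.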

resize = torchvision.transforms.Resize(224)
img_rs = resize(img)
img_rs

Flip an image

1. Flip horizontally

flip = torchvision.transforms.RandomHorizontalFlip(p=1.0)
flip(img_rs)

2. Flip vertically

flip = torchvision.transforms.RandomVerticalFlip(p=1.0)
flip(img_rs)

Change brightness, contrast, saturation and hue of an image

colorjitter = torchvision.transforms.ColorJitter(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.3)
colorjitter(img_rs)

Convert an image to grayscale

grayscale = torchvision.transforms.Grayscale()
grayscale(img_rs)

Crop an image

crop = torchvision.transforms.CenterCrop(128)
crop(img_rs)