Computer Vision & Image Processing¶
How does Computer Vision work?¶
import numpy as np #Python library used for working with arrays
import matplotlib.pyplot as plt #Library for interactive visualizations in Python
from PIL import Image #Python Imaging Library
pic = Image.open('sample_data/codechamp.png')
pic
type(pic)
PIL.PngImagePlugin.PngImageFile
pic_arr = np.asarray(pic)
print(pic_arr)
type(pic_arr)
[[[ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]
  ...
  [ 25  57  80 255]
  [ 25  57  80 255]
  [ 14  14  14 255]]

 [[ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]
  ...
  [ 25  57  80 255]
  [ 25  57  80 255]
  [ 14  14  14 255]]

 [[ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]
  ...
  [ 25  57  80 255]
  [ 25  57  80 255]
  [ 14  14  14 255]]

 ...

 [[ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]
  ...
  [ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]]

 [[ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]
  ...
  [ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]]

 [[ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]
  ...
  [ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]]]
numpy.ndarray
Dimensions of the picture
pic_arr.shape
(1276, 1272, 4)
Displaying the picture as a plot
plt.imshow(pic_arr)
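The shape (1276, 1272, 4) shows that the PNG has four channels: red, green, blue and alpha (RGBA). As a minimal sketch (assuming the pic_arr array from above), a single channel can be viewed on its own:
# Show only the red channel as a grayscale image (channel order here is R, G, B, A)
red_channel = pic_arr[:, :, 0]
plt.imshow(red_channel, cmap='gray')
plt.title('Red channel')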
import cv2
img1 = cv2.imread('sample_data/dog_backpack.png')
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
img2 = cv2.imread('sample_data/do_not_copy.png')
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)
plt.imshow(img1)
img1.shape
(901, 1198, 3)
plt.imshow(img2)
img2.shape
(1081, 1077, 3)
Resizing the images so both have the same dimensions (required for blending)
img1=cv2.resize(img1,(1200,1200))
img2=cv2.resize(img2,(1200,1200))
BLENDING IMAGES OF THE SAME SIZE¶
blended = cv2.addWeighted(src1=img1, alpha=0.9, src2=img2, beta=0.3, gamma=0)
plt.imshow(blended)
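cv2.addWeighted computes dst = src1*alpha + src2*beta + gamma for every pixel, which is why both images had to be resized to the same shape first. A rough NumPy equivalent of the blend above (a sketch only; OpenCV additionally rounds and saturates the result):
# Approximate NumPy version of the blend, using the same alpha=0.9, beta=0.3, gamma=0
manual_blend = 0.9 * img1.astype(np.float32) + 0.3 * img2.astype(np.float32) + 0
manual_blend = np.clip(manual_blend, 0, 255).astype(np.uint8)
plt.imshow(manual_blend)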
Blurring¶
Gaussian Blur: applies a Gaussian filter that weights nearby pixels more heavily than distant ones, producing smoother results and preserving edges better than a plain averaging (box) blur.
# Importing the Image and ImageFilter modules from the PIL package
from PIL import Image, ImageFilter
# creating an image object
im1 = Image.open("sample_data/leave.jpg")
# applying the Gaussian Blur filter
im2 = im1.filter(ImageFilter.GaussianBlur(radius=2))
plt.imshow(im2)
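The same effect can be obtained with OpenCV; a minimal sketch applying cv2.GaussianBlur to the dog image loaded earlier (the 5 x 5 kernel size is an illustrative choice, and sigmaX=0 lets OpenCV derive the standard deviation from the kernel size):
# Gaussian blur with OpenCV instead of PIL
blurred_cv = cv2.GaussianBlur(img1, (5, 5), 0)
plt.imshow(blurred_cv)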
Computer Vision Challenges¶
Challenges computer vision faces on its way to becoming a leading technology:
- Reasoning and analytical issues: working in computer vision requires strong reasoning and analytical skills, so becoming an expert in the field is demanding.
- Privacy and security: vision-powered surveillance raises serious privacy concerns in many countries and can be used to restrict users from accessing unauthorized content; several countries avoid face recognition and detection techniques altogether for privacy and security reasons.
- Duplicate and false content: a data breach can lead to serious problems, such as duplicate or falsified images and videos spreading across the internet.
DEEP LEARNING¶
Dataset Details¶
Dogs and Cats Dataset¶
https://www.tensorflow.org/datasets/catalog/cats_vs_dogs
- The Dogs and Cats dataset contains around 23k images of dogs and cats.
- We will be downloading the dataset from the TensorFlow Datasets library.
- The dataset is already split into training and validation sets, so we don't need to split it manually.
- The images come in varying sizes, so they will be resized to a fixed size (150 x 150 x 3) by the data generator later in the notebook.
- To better understand the dataset, we will download it and then visualize a few examples.
!pip install tensorflow_datasets
import tensorflow as tf
import tensorflow_datasets as tfds
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
dataset, info = tfds.load('cats_vs_dogs', with_info=True, as_supervised=True)
Downloading and preparing dataset 786.67 MiB (download: 786.67 MiB, generated: 1.04 GiB, total: 1.81 GiB) to /root/tensorflow_datasets/cats_vs_dogs/4.0.1...
WARNING:absl:1738 images were corrupted and were skipped
Dataset cats_vs_dogs downloaded and prepared to /root/tensorflow_datasets/cats_vs_dogs/4.0.1. Subsequent calls will reuse this data.
class_names = info.features['label'].names
class_names
['cat', 'dog']
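Before converting anything, it helps to look at a few raw examples. A minimal sketch (assuming the dataset and class_names objects from the cells above) that plots four training images with their labels:
# Display a 2x2 grid of sample images with their class names
plt.figure(figsize=(6, 6))
for i, (image, label) in enumerate(dataset['train'].take(4)):
    plt.subplot(2, 2, i + 1)
    plt.imshow(image.numpy().astype('uint8'))
    plt.title(class_names[label.numpy()])
    plt.axis('off')
plt.show()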
for i, example in enumerate(dataset['train']):
    # example = (image, label)
    image, label = example
    save_dir = './cats_vs_dogs/train/{}'.format(class_names[label])
    os.makedirs(save_dir, exist_ok=True)
    filename = save_dir + "/" + "{}_{}.jpg".format(class_names[label], i)
    tf.keras.preprocessing.image.save_img(filename, image.numpy())
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization
from tensorflow.keras.models import Sequential
What is Overfitting?¶
Overfitting is a common problem in deep learning where the model becomes too complex and fits the training data too well but fails to generalize to new, unseen data. Here are some characteristics of overfitting in deep learning (a quick check using a Keras History object follows the list):
- High training accuracy: the model achieves very high accuracy on the training set.
- Low validation accuracy: the model has low accuracy on the validation set, which indicates that it is not generalizing well to new data.
- Large gap between training and validation accuracy: there is a large difference between the training accuracy and the validation accuracy, which indicates that the model is memorizing the training data rather than learning the underlying patterns.
- Poor performance on test data: when tested on new, unseen data, the model performs poorly, indicating that it is not able to generalize well beyond the training data.
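A quick way to spot these symptoms in practice is to compare the curves recorded by Keras during training. A minimal sketch (assuming a History object named history, like the one returned by model.fit later in this notebook):
# Rough overfitting check: a persistently large gap between training and validation accuracy is a warning sign
train_acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
print('Final train/val accuracy gap: {:.3f}'.format(train_acc[-1] - val_acc[-1]))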
What is Regularization?¶
- Regularization is a technique used in deep learning to prevent overfitting of the model. Overfitting occurs when the model fits too well to the training data and fails to generalize well to new, unseen data.
- Regularization adds a penalty term to the loss function of the model, which encourages the model to learn simpler, smoother decision boundaries rather than complex, jagged ones. This helps the model to avoid overfitting and generalize better to new data.
Types of Regularization¶
L1 and L2 Regularization: add a penalty term to the loss function that encourages the model to learn smaller weights (see the sketch after this list)
Dropout: randomly drops out some of the neurons during training to prevent co-adaptation of neurons and encourage the model to learn more robust features
For more information about Dropout, please check here https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dropout
Early Stopping: stops the training process when the validation loss stops improving to prevent overfitting
Data Augmentation: artificially increases the size of the training set by applying random transformations to the data to improve the generalization ability of the model
Batch Normalization: normalizes the input data to each layer of the network, which helps to prevent overfitting and improve the training speed
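To make the first three techniques concrete, here is a minimal Keras sketch combining an L2 weight penalty, Dropout and EarlyStopping (the layer sizes and input shape are illustrative and not part of the model built later):
# Sketch: L2 regularization, Dropout and EarlyStopping in Keras
from tensorflow.keras import layers, models, regularizers, callbacks

reg_model = models.Sequential([
    layers.Dense(64, activation='relu', kernel_regularizer=regularizers.l2(0.01), input_shape=(100,)),
    layers.Dropout(0.5),   # randomly drops 50% of the units during training
    layers.Dense(1, activation='sigmoid')
])
early_stop = callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
# early_stop would then be passed to fit(): reg_model.fit(..., callbacks=[early_stop])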
Image augmentation: a technique that can be used to artificially expand the size of a training dataset by creating modified versions of images in the dataset.
For more information about ImageDataGenerator, please check the official documentation: https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator
datagen = ImageDataGenerator(rescale=1/255, validation_split=0.2, rotation_range=10,
width_shift_range=0.1, height_shift_range=0.1,
shear_range=0.1, zoom_range=0.10, horizontal_flip=True)
train_generator = datagen.flow_from_directory('/content/cats_vs_dogs/train',
target_size = (150, 150),
batch_size=32,
class_mode='binary',
subset='training')
validation_generator = datagen.flow_from_directory('/content/cats_vs_dogs/train',
target_size = (150, 150),
batch_size=32,
class_mode='binary',
subset='validation')
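To see what the augmentation actually produces, here is a minimal sketch that pulls one batch from train_generator and displays the first four augmented images with their labels:
# Show a few augmented images from one training batch
idx_to_class = {v: k for k, v in train_generator.class_indices.items()}
images, labels = next(train_generator)
plt.figure(figsize=(8, 8))
for i in range(4):
    plt.subplot(2, 2, i + 1)
    plt.imshow(images[i])   # already scaled to [0, 1] by rescale=1/255
    plt.title(idx_to_class[int(labels[i])])
    plt.axis('off')
plt.show()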
Building a Convolutional Neural Network (CNN) Model with Image Augmentation¶
- CNN Building Blocks
- Input Layer
- Convolutional Layer
- Pooling Layer
- Dropout Layer
- Batch Normalization Layer
- Activation Layer
- Fully Connected Layer
- Flatten Layer
- Output Layer
model = Sequential()
# 1st layer CNN
model.add(Conv2D(32, kernel_size=3, activation='relu', input_shape=(150, 150, 3)))
model.add(MaxPooling2D(2))
model.add(BatchNormalization())
model.add(Dropout(0.2))
# 2nd layer CNN
model.add(Conv2D(64, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(2))
model.add(BatchNormalization())
model.add(Dropout(0.2))
# 3rd Layer
model.add(Conv2D(128, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(2))
model.add(BatchNormalization())
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(512, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                    Output Shape              Param #
=================================================================
 conv2d (Conv2D)                 (None, 148, 148, 32)      896
 max_pooling2d (MaxPooling2D)    (None, 74, 74, 32)        0
 conv2d_1 (Conv2D)               (None, 72, 72, 64)        18496
 max_pooling2d_1 (MaxPooling2D)  (None, 36, 36, 64)        0
 conv2d_2 (Conv2D)               (None, 34, 34, 128)       73856
 max_pooling2d_2 (MaxPooling2D)  (None, 17, 17, 128)       0
 flatten (Flatten)               (None, 36992)             0
 dropout (Dropout)               (None, 36992)             0
 dense (Dense)                   (None, 512)               18940416
 dense_1 (Dense)                 (None, 1)                 513
=================================================================
Total params: 19034177 (72.61 MB)
Trainable params: 19034177 (72.61 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_generator, epochs=10, validation_data=validation_generator)
Epoch 1/10
582/582 [==============================] - 1207s 2s/step - loss: 0.6556 - accuracy: 0.6181 - val_loss: 0.6249 - val_accuracy: 0.6237
Epoch 2/10
582/582 [==============================] - 1245s 2s/step - loss: 0.5548 - accuracy: 0.7135 - val_loss: 0.5331 - val_accuracy: 0.7334
Epoch 3/10
582/582 [==============================] - 1242s 2s/step - loss: 0.5061 - accuracy: 0.7513 - val_loss: 0.4657 - val_accuracy: 0.7790
Epoch 4/10
582/582 [==============================] - 1226s 2s/step - loss: 0.4628 - accuracy: 0.7807 - val_loss: 0.4379 - val_accuracy: 0.7899
Epoch 5/10
582/582 [==============================] - 1224s 2s/step - loss: 0.4440 - accuracy: 0.7913 - val_loss: 0.4302 - val_accuracy: 0.8033
Epoch 6/10
582/582 [==============================] - 1236s 2s/step - loss: 0.4205 - accuracy: 0.8079 - val_loss: 0.4180 - val_accuracy: 0.8052
Epoch 7/10
582/582 [==============================] - 1238s 2s/step - loss: 0.4022 - accuracy: 0.8170 - val_loss: 0.3939 - val_accuracy: 0.8237
Epoch 8/10
582/582 [==============================] - 1231s 2s/step - loss: 0.3840 - accuracy: 0.8245 - val_loss: 0.3596 - val_accuracy: 0.8381
Epoch 9/10
582/582 [==============================] - 1237s 2s/step - loss: 0.3755 - accuracy: 0.8293 - val_loss: 0.3485 - val_accuracy: 0.8557
Epoch 10/10
582/582 [==============================] - 1239s 2s/step - loss: 0.3590 - accuracy: 0.8400 - val_loss: 0.3366 - val_accuracy: 0.8527
history.history
plt.plot(history.history['accuracy'], label='Training')
plt.plot(history.history['val_accuracy'], label='Validation')
plt.legend(['Training', 'Validation'])
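The loss curves often reveal overfitting earlier than the accuracy curves do; a short sketch using the same history object:
# Plot training vs. validation loss
plt.plot(history.history['loss'], label='Training loss')
plt.plot(history.history['val_loss'], label='Validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()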
model.save('cats_vs_dogs.h5')
/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py:3103: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')`. saving_api.save_model(
model_load = tf.keras.models.load_model('cats_vs_dogs.h5')
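As the warning above suggests, the native Keras format can be used instead of the legacy HDF5 file; a minimal sketch of the same save/load round trip (the .keras filename is just an example):
# Save and reload the model in the native Keras format
model.save('cats_vs_dogs.keras')
model_keras = tf.keras.models.load_model('cats_vs_dogs.keras')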
import requests
from PIL import Image
from tensorflow.keras.preprocessing import image
#testing our model
img_url = "https://i.natgeofe.com/n/548467d8-c5f1-4551-9f58-6817a8d2c45e/NationalGeographic_2572187_square.jpg"
img = Image.open(requests.get(img_url, stream=True).raw).resize((150, 150))
image_array = image.img_to_array(img)
img = np.expand_dims(image_array, axis=0)
img = img/255
prediction = model_load.predict(img)
TH = 0.5
prediction = int(prediction[0][0]>TH)
classes = {v:k for k,v in train_generator.class_indices.items()}
classes[prediction]
1/1 [==============================] - 0s 201ms/step
'cat'
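The testing steps above can be wrapped in a small helper for trying out other images. A sketch with a hypothetical predict_from_url function, assuming the same 150 x 150 input size and the classes mapping defined above:
# Hypothetical helper: download an image from a URL and classify it with the loaded model
def predict_from_url(url, model, classes, threshold=0.5, target_size=(150, 150)):
    img = Image.open(requests.get(url, stream=True).raw).resize(target_size)
    arr = image.img_to_array(img) / 255.0   # same rescaling as during training
    arr = np.expand_dims(arr, axis=0)       # add the batch dimension
    prob = model.predict(arr)[0][0]         # sigmoid output = probability of class 1 ('dog')
    return classes[int(prob > threshold)]

predict_from_url(img_url, model_load, classes)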