Computer Vision & Image Processing¶
How does Computer Vision work?¶
import numpy as np #Python library used for working with arrays
import matplotlib.pyplot as plt #Library for interactive visualizations in Python
from PIL import Image #Python Imaging Library
pic = Image.open('sample_data/codechamp.png')
pic
type(pic)
PIL.PngImagePlugin.PngImageFile
pic_arr = np.asarray(pic)
print(pic_arr)
type(pic_arr)
[[[ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]
  ...
  [ 25  57  80 255]
  [ 25  57  80 255]
  [ 14  14  14 255]]

 [[ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]
  ...
  [ 25  57  80 255]
  [ 25  57  80 255]
  [ 14  14  14 255]]

 [[ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]
  ...
  [ 25  57  80 255]
  [ 25  57  80 255]
  [ 14  14  14 255]]

 ...

 [[ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]
  ...
  [ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]]

 [[ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]
  ...
  [ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]]

 [[ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]
  ...
  [ 14  14  14 255]
  [ 14  14  14 255]
  [ 14  14  14 255]]]
numpy.ndarray
Dimensions of the picture
pic_arr.shape
(1276, 1272, 4)
Displaying the picture as a plot
plt.imshow(pic_arr)
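The shape (1276, 1272, 4) shows that the PNG has four channels: red, green, blue and alpha (RGBA). As a minimal sketch (assuming the pic_arr array from above), a single channel can be viewed on its own:
# Show only the red channel as a grayscale image (channel order here is R, G, B, A)
red_channel = pic_arr[:, :, 0]
plt.imshow(red_channel, cmap='gray')
plt.title('Red channel')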
import cv2
img1 = cv2.imread('sample_data/dog_backpack.png')
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
img2 = cv2.imread('sample_data/do_not_copy.png')
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)
plt.imshow(img1)
img1.shape
(901, 1198, 3)
plt.imshow(img2)
img2.shape
(1081, 1077, 3)
Resizing the images so both have the same dimensions (required for blending)
img1=cv2.resize(img1,(1200,1200))
img2=cv2.resize(img2,(1200,1200))
BLENDING IMAGES OF THE SAME SIZE¶
blended = cv2.addWeighted(src1=img1, alpha=0.9, src2=img2, beta=0.3, gamma=0)
plt.imshow(blended)
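cv2.addWeighted computes dst = src1*alpha + src2*beta + gamma for every pixel, which is why both images had to be resized to the same shape first. A rough NumPy equivalent of the blend above (a sketch only; OpenCV additionally rounds and saturates the result):
# Approximate NumPy version of the blend, using the same alpha=0.9, beta=0.3, gamma=0
manual_blend = 0.9 * img1.astype(np.float32) + 0.3 * img2.astype(np.float32) + 0
manual_blend = np.clip(manual_blend, 0, 255).astype(np.uint8)
plt.imshow(manual_blend)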
Blurring¶
Gaussian Blur: applies a Gaussian filter that weights nearby pixels more heavily than distant ones, producing smoother results and preserving edges better than a plain averaging (box) blur.
# Importing the Image and ImageFilter modules from the PIL package
from PIL import Image, ImageFilter
# creating an image object
im1 = Image.open("sample_data/leave.jpg")
# applying the Gaussian Blur filter
im2 = im1.filter(ImageFilter.GaussianBlur(radius=2))
plt.imshow(im2)
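The same effect can be obtained with OpenCV; a minimal sketch applying cv2.GaussianBlur to the dog image loaded earlier (the 5 x 5 kernel size is an illustrative choice, and sigmaX=0 lets OpenCV derive the standard deviation from the kernel size):
# Gaussian blur with OpenCV instead of PIL
blurred_cv = cv2.GaussianBlur(img1, (5, 5), 0)
plt.imshow(blurred_cv)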
Computer Vision Challenges¶
Challenges computer vision faces on its way to becoming a leading technology:
- Reasoning and analytical issues: working in computer vision requires strong reasoning and analytical skills, so becoming an expert in the field is demanding.
- Privacy and security: vision-powered surveillance raises serious privacy concerns in many countries and can be used to restrict users from accessing unauthorized content; several countries avoid face recognition and detection techniques altogether for privacy and security reasons.
- Duplicate and false content: a data breach can lead to serious problems, such as duplicate or falsified images and videos spreading across the internet.
DEEP LEARNING¶
Dataset Details¶
Dogs and Cats Dataset¶
https://www.tensorflow.org/datasets/catalog/cats_vs_dogs
- The Dogs and Cats dataset contains around 23k images of dogs and cats.
- We will be downloading the dataset from the TensorFlow Datasets library.
- The dataset is already split into training and validation sets, so we don't need to split it manually.
- The images come in varying sizes, so they will be resized to a fixed size (150 x 150 x 3) by the data generator later in the notebook.
- To better understand the dataset, we will download it and then visualize a few examples.
!pip install tensorflow_datasets
import tensorflow as tf
import tensorflow_datasets as tfds
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
dataset, info = tfds.load('cats_vs_dogs', with_info=True, as_supervised=True)
Downloading and preparing dataset 786.67 MiB (download: 786.67 MiB, generated: 1.04 GiB, total: 1.81 GiB) to /root/tensorflow_datasets/cats_vs_dogs/4.0.1...
WARNING:absl:1738 images were corrupted and were skipped
Dataset cats_vs_dogs downloaded and prepared to /root/tensorflow_datasets/cats_vs_dogs/4.0.1. Subsequent calls will reuse this data.
class_names = info.features['label'].names
class_names
['cat', 'dog']
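Before converting anything, it helps to look at a few raw examples. A minimal sketch (assuming the dataset and class_names objects from the cells above) that plots four training images with their labels:
# Display a 2x2 grid of sample images with their class names
plt.figure(figsize=(6, 6))
for i, (image, label) in enumerate(dataset['train'].take(4)):
    plt.subplot(2, 2, i + 1)
    plt.imshow(image.numpy().astype('uint8'))
    plt.title(class_names[label.numpy()])
    plt.axis('off')
plt.show()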
for i, example in enumerate(dataset['train']):
    # example = (image, label)
    image, label = example
    save_dir = './cats_vs_dogs/train/{}'.format(class_names[label])
    os.makedirs(save_dir, exist_ok=True)
    filename = save_dir + "/" + "{}_{}.jpg".format(class_names[label], i)
    tf.keras.preprocessing.image.save_img(filename, image.numpy())
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization
from tensorflow.keras.models import Sequential
What is Overfitting?¶
Overfitting is a common problem in deep learning where the model becomes too complex and fits the training data too well but fails to generalize to new, unseen data. Here are some characteristics of overfitting in deep learning (a quick check using a Keras History object follows the list):
- High training accuracy: the model achieves very high accuracy on the training set.
- Low validation accuracy: the model has low accuracy on the validation set, which indicates that it is not generalizing well to new data.
- Large gap between training and validation accuracy: there is a large difference between the training accuracy and the validation accuracy, which indicates that the model is memorizing the training data rather than learning the underlying patterns.
- Poor performance on test data: when tested on new, unseen data, the model performs poorly, indicating that it is not able to generalize well beyond the training data.
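A quick way to spot these symptoms in practice is to compare the curves recorded by Keras during training. A minimal sketch (assuming a History object named history, like the one returned by model.fit later in this notebook):
# Rough overfitting check: a persistently large gap between training and validation accuracy is a warning sign
train_acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
print('Final train/val accuracy gap: {:.3f}'.format(train_acc[-1] - val_acc[-1]))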
What is Regularization?¶
- Regularization is a technique used in deep learning to prevent overfitting of the model. Overfitting occurs when the model fits too well to the training data and fails to generalize well to new, unseen data.
- Regularization adds a penalty term to the loss function of the model, which encourages the model to learn simpler, smoother decision boundaries rather than complex, jagged ones. This helps the model to avoid overfitting and generalize better to new data.
Types of Regularization¶
L1 and L2 Regularization: add a penalty term to the loss function that encourages the model to learn smaller weights (see the sketch after this list)
Dropout: randomly drops out some of the neurons during training to prevent co-adaptation of neurons and encourage the model to learn more robust features
For more information about Dropout, please check here https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dropout
Early Stopping: stops the training process when the validation loss stops improving to prevent overfitting
Data Augmentation: artificially increases the size of the training set by applying random transformations to the data to improve the generalization ability of the model
Batch Normalization: normalizes the input data to each layer of the network, which helps to prevent overfitting and improve the training speed
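To make the first three techniques concrete, here is a minimal Keras sketch combining an L2 weight penalty, Dropout and EarlyStopping (the layer sizes and input shape are illustrative and not part of the model built later):
# Sketch: L2 regularization, Dropout and EarlyStopping in Keras
from tensorflow.keras import layers, models, regularizers, callbacks

reg_model = models.Sequential([
    layers.Dense(64, activation='relu', kernel_regularizer=regularizers.l2(0.01), input_shape=(100,)),
    layers.Dropout(0.5),   # randomly drops 50% of the units during training
    layers.Dense(1, activation='sigmoid')
])
early_stop = callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
# early_stop would then be passed to fit(): reg_model.fit(..., callbacks=[early_stop])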
Image augmentation: a technique that can be used to artificially expand the size of a training dataset by creating modified versions of images in the dataset.
For more information about ImageDataGenerator, please check the official documentation: https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator
datagen = ImageDataGenerator(rescale=1/255, validation_split=0.2, rotation_range=10,
width_shift_range=0.1, height_shift_range=0.1,
shear_range=0.1, zoom_range=0.10, horizontal_flip=True)
train_generator = datagen.flow_from_directory('/content/cats_vs_dogs/train',
target_size = (150, 150),
batch_size=32,
class_mode='binary',
subset='training')
validation_generator = datagen.flow_from_directory('/content/cats_vs_dogs/train',
target_size = (150, 150),
batch_size=32,
class_mode='binary',
subset='validation')
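To see what the augmentation actually produces, here is a minimal sketch that pulls one batch from train_generator and displays the first four augmented images with their labels:
# Show a few augmented images from one training batch
idx_to_class = {v: k for k, v in train_generator.class_indices.items()}
images, labels = next(train_generator)
plt.figure(figsize=(8, 8))
for i in range(4):
    plt.subplot(2, 2, i + 1)
    plt.imshow(images[i])   # already scaled to [0, 1] by rescale=1/255
    plt.title(idx_to_class[int(labels[i])])
    plt.axis('off')
plt.show()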
Building a Convolutional Neural Network (CNN) Model with Image Augmentation¶
- CNN Building Blocks
- Input Layer
- Convolutional Layer
- Pooling Layer
- Dropout Layer
- Batch Normalization Layer
- Activation Layer
- Fully Connected Layer
- Flatten Layer
- Output Layer
model = Sequential()
# 1st layer CNN
model.add(Conv2D(32, kernel_size=3, activation='relu', input_shape=(150, 150, 3)))
model.add(MaxPooling2D(2))
model.add(BatchNormalization())
model.add(Dropout(0.2))
# 2nd layer CNN
model.add(Conv2D(64, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(2))
model.add(BatchNormalization())
model.add(Dropout(0.2))
# 3rd Layer
model.add(Conv2D(128, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(2))
model.add(BatchNormalization())
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(512, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                    Output Shape              Param #
=================================================================
 conv2d (Conv2D)                 (None, 148, 148, 32)      896
 max_pooling2d (MaxPooling2D)    (None, 74, 74, 32)        0
 conv2d_1 (Conv2D)               (None, 72, 72, 64)        18496
 max_pooling2d_1 (MaxPooling2D)  (None, 36, 36, 64)        0
 conv2d_2 (Conv2D)               (None, 34, 34, 128)       73856
 max_pooling2d_2 (MaxPooling2D)  (None, 17, 17, 128)       0
 flatten (Flatten)               (None, 36992)             0
 dropout (Dropout)               (None, 36992)             0
 dense (Dense)                   (None, 512)               18940416
 dense_1 (Dense)                 (None, 1)                 513
=================================================================
Total params: 19034177 (72.61 MB)
Trainable params: 19034177 (72.61 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_generator, epochs=10, validation_data=validation_generator)
Epoch 1/10
582/582 [==============================] - 1207s 2s/step - loss: 0.6556 - accuracy: 0.6181 - val_loss: 0.6249 - val_accuracy: 0.6237
Epoch 2/10
582/582 [==============================] - 1245s 2s/step - loss: 0.5548 - accuracy: 0.7135 - val_loss: 0.5331 - val_accuracy: 0.7334
Epoch 3/10
582/582 [==============================] - 1242s 2s/step - loss: 0.5061 - accuracy: 0.7513 - val_loss: 0.4657 - val_accuracy: 0.7790
Epoch 4/10
582/582 [==============================] - 1226s 2s/step - loss: 0.4628 - accuracy: 0.7807 - val_loss: 0.4379 - val_accuracy: 0.7899
Epoch 5/10
582/582 [==============================] - 1224s 2s/step - loss: 0.4440 - accuracy: 0.7913 - val_loss: 0.4302 - val_accuracy: 0.8033
Epoch 6/10
582/582 [==============================] - 1236s 2s/step - loss: 0.4205 - accuracy: 0.8079 - val_loss: 0.4180 - val_accuracy: 0.8052
Epoch 7/10
582/582 [==============================] - 1238s 2s/step - loss: 0.4022 - accuracy: 0.8170 - val_loss: 0.3939 - val_accuracy: 0.8237
Epoch 8/10
582/582 [==============================] - 1231s 2s/step - loss: 0.3840 - accuracy: 0.8245 - val_loss: 0.3596 - val_accuracy: 0.8381
Epoch 9/10
582/582 [==============================] - 1237s 2s/step - loss: 0.3755 - accuracy: 0.8293 - val_loss: 0.3485 - val_accuracy: 0.8557
Epoch 10/10
582/582 [==============================] - 1239s 2s/step - loss: 0.3590 - accuracy: 0.8400 - val_loss: 0.3366 - val_accuracy: 0.8527
history.history
plt.plot(history.history['accuracy'], label='Training')
plt.plot(history.history['val_accuracy'], label='Validation')
plt.legend(['Training', 'Validation'])
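The loss curves often reveal overfitting earlier than the accuracy curves do; a short sketch using the same history object:
# Plot training vs. validation loss
plt.plot(history.history['loss'], label='Training loss')
plt.plot(history.history['val_loss'], label='Validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()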
model.save('cats_vs_dogs.h5')
/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py:3103: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')`. saving_api.save_model(
model_load = tf.keras.models.load_model('cats_vs_dogs.h5')
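As the warning above suggests, the native Keras format can be used instead of the legacy HDF5 file; a minimal sketch of the same save/load round trip (the .keras filename is just an example):
# Save and reload the model in the native Keras format
model.save('cats_vs_dogs.keras')
model_keras = tf.keras.models.load_model('cats_vs_dogs.keras')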
import requests
from PIL import Image
from tensorflow.keras.preprocessing import image
#testing our model
img_url = "https://i.natgeofe.com/n/548467d8-c5f1-4551-9f58-6817a8d2c45e/NationalGeographic_2572187_square.jpg"
img = Image.open(requests.get(img_url, stream=True).raw).resize((150, 150))
image_array = image.img_to_array(img)
img = np.expand_dims(image_array, axis=0)
img = img/255
prediction = model_load.predict(img)
TH = 0.5
prediction = int(prediction[0][0]>TH)
classes = {v:k for k,v in train_generator.class_indices.items()}
classes[prediction]
1/1 [==============================] - 0s 201ms/step
'cat'
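The testing steps above can be wrapped in a small helper for trying out other images. A sketch with a hypothetical predict_from_url function, assuming the same 150 x 150 input size and the classes mapping defined above:
# Hypothetical helper: download an image from a URL and classify it with the loaded model
def predict_from_url(url, model, classes, threshold=0.5, target_size=(150, 150)):
    img = Image.open(requests.get(url, stream=True).raw).resize(target_size)
    arr = image.img_to_array(img) / 255.0   # same rescaling as during training
    arr = np.expand_dims(arr, axis=0)       # add the batch dimension
    prob = model.predict(arr)[0][0]         # sigmoid output = probability of class 1 ('dog')
    return classes[int(prob > threshold)]

predict_from_url(img_url, model_load, classes)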