Image Recognition
Learning Objectives
Tools > Jupyter Notebook
Supervised Learning for Computer Vision
In contrast, supervised learning involves training algorithms on labeled datasets,
Table 4.1: Challenges of visual data classification
Analyzing the data to extract informative patterns and features.
Loading and Preprocessing Images
The imread() function loads the image in "RGB" format.
Resizing has the effect of converting RGB images to a float-based format:
The first image has a shape of (100, 100, 4). The "4" reveals
Grayscale images have only one channel (rather than the 3 RGB channels).
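The channel handling described above can be sketched with NumPy alone. The random array below is a stand-in for a loaded four-channel image, and the luminance weights are an assumption (a commonly used weighting); a library conversion routine may use slightly different values.

```python
import numpy as np

# Stand-in for a loaded image with four channels (red, green, blue, alpha).
rng = np.random.default_rng(0)
rgba_image = rng.random((100, 100, 4))

# Keep only the first three channels, discarding the alpha channel.
rgb_image = rgba_image[:, :, :3]

# Collapse the three color channels into one with a weighted sum.
# These weights are an assumed, commonly used luminance weighting.
weights = np.array([0.299, 0.587, 0.114])
gray_image = rgb_image @ weights

print(rgba_image.shape)  # (100, 100, 4)
print(rgb_image.shape)   # (100, 100, 3)
print(gray_image.shape)  # (100, 100)
```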
The next step is to convert the resized_images and labels lists to numpy arrays, which is expected
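A minimal sketch of this conversion is shown here. The list contents and class names are hypothetical placeholders, and the names X and y are assumptions chosen for illustration; only resized_images and labels come from the text.

```python
import numpy as np

# Hypothetical stand-in data: three 100x100 grayscale images and their labels.
resized_images = [
    np.zeros((100, 100)),
    np.ones((100, 100)),
    np.full((100, 100), 0.5),
]
labels = ["cat", "dog", "cat"]

# Stack the equally sized images into one array; the first axis indexes images.
X = np.array(resized_images)
y = np.array(labels)

print(X.shape)  # (3, 100, 100)
print(y.shape)  # (3,)
```

The conversion only yields a regular numeric array when every image in the list has the same shape, which is why it is performed after resizing.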
Prediction without Feature Engineering
The following code can be used to "flatten" each image into a one-dimensional vector.
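A minimal sketch of the flattening step, assuming the images are stored in a NumPy array whose first axis indexes the images. The array X and its contents are placeholders, not the chapter's dataset.

```python
import numpy as np

# Placeholder data: three 100x100 single-channel images.
X = np.arange(3 * 100 * 100, dtype=float).reshape(3, 100, 100)

# Keep the first axis (one row per image) and merge all remaining axes.
X_flat = X.reshape(len(X), -1)

print(X_flat.shape)  # (3, 10000)
```

Each row of X_flat now holds the pixel values of one image, read row by row.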
Figure 4.7: Confusion matrix of MultinomialNB algorithm performance
The MultinomialNB algorithm achieves an accuracy of around 30%.
MultinomialNB
SGDClassifier
Standard scaling
The following code uses the StandardScaler tool from the sklearn library to scale the data:
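A minimal sketch of standard scaling with StandardScaler, run on random placeholder data. The variable names and array sizes are assumptions for illustration. The scaler is fitted on the training set only and then reused on the testing set, so that no information from the testing set leaks into the preprocessing.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Placeholder data: 40 training rows and 10 testing rows with 400 features each.
rng = np.random.default_rng(0)
X_train = rng.random((40, 400))
X_test = rng.random((10, 400))

scaler = StandardScaler()

# Learn the per-feature mean and standard deviation from the training set,
# then apply the same transformation to both sets.
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```

After scaling, every feature of the training set has a mean of approximately 0 and a standard deviation of approximately 1.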
Prediction with Feature Selection
The next step is to create the HOG features for each image in the data.
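To illustrate the idea behind HOG (histogram of oriented gradients) features, the sketch below implements a simplified version in NumPy. It is not a full HOG implementation: library versions such as skimage.feature.hog additionally normalize the histograms over overlapping blocks of cells. The function name simple_hog and its parameters are hypothetical, and the input is assumed to be a 2-D grayscale image.

```python
import numpy as np

def simple_hog(image, cell_size=8, n_bins=9):
    """Simplified HOG-style descriptor for a 2-D grayscale image."""
    # Gradients along the rows (gy) and the columns (gx).
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    # Unsigned gradient orientation in degrees, folded into [0, 180].
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0

    n_rows = image.shape[0] // cell_size
    n_cols = image.shape[1] // cell_size
    features = np.zeros((n_rows, n_cols, n_bins))
    for r in range(n_rows):
        for c in range(n_cols):
            rows = slice(r * cell_size, (r + 1) * cell_size)
            cols = slice(c * cell_size, (c + 1) * cell_size)
            # Orientation histogram of the cell, weighted by gradient magnitude.
            hist, _ = np.histogram(
                orientation[rows, cols],
                bins=n_bins,
                range=(0.0, 180.0),
                weights=magnitude[rows, cols],
            )
            features[r, c] = hist

    # Flatten to one vector and apply a single global L2 normalization.
    features = features.ravel()
    norm = np.linalg.norm(features)
    return features / norm if norm > 0 else features

# Placeholder image: random values instead of a real photograph.
rng = np.random.default_rng(0)
image = rng.random((100, 100))
hog_features = simple_hog(image)
print(hog_features.shape)  # (1296,) = 12 x 12 cells x 9 bins
```

The descriptor is much shorter than the 10,000 raw pixel values of a 100x100 image and summarizes local edge directions rather than individual pixel intensities.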
A new SGDClassifier can now be trained on this new representation:
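A minimal sketch of training and evaluating an SGDClassifier on a feature matrix. The data here is synthetic (two well-separated clusters) and only stands in for the HOG representation; the variable names and sizes are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_split(n_per_class):
    # Synthetic stand-in for feature vectors: two Gaussian clusters.
    X = np.vstack([
        rng.normal(0.0, 1.0, (n_per_class, 20)),
        rng.normal(3.0, 1.0, (n_per_class, 20)),
    ])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

X_train, y_train = make_split(50)
X_test, y_test = make_split(20)

# Fit a linear classifier with stochastic gradient descent.
clf = SGDClassifier(random_state=0)
clf.fit(X_train, y_train)

# Predict the labels of the testing set and measure the accuracy.
pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, pred)
print(accuracy)
```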
Prediction Using Neural Networks
The Sequential tool from the Keras library can now be used to build a neural network as a sequence of layers.
Given that the output layer has 16 neurons that are fully connected
Table 4.2: The arguments of the "compile" method
The fit() method is used to train a model on a given set of input data and labels.
Table 4.3: The arguments of the "fit" method
The trained model can now be used to predict the labels of the images in the testing set:
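The steps above (building a Sequential model, calling compile, fit, and then predicting) can be sketched end to end as follows. This is an illustrative sketch on random placeholder data, not the chapter's model: the hidden layer size, optimizer, loss, number of epochs, and batch size are all assumptions. Only the 16-neuron output layer follows the text.

```python
import numpy as np
from tensorflow import keras

# Placeholder data: random feature vectors and labels for 16 classes.
rng = np.random.default_rng(0)
n_features, n_classes = 64, 16
X_train = rng.random((160, n_features)).astype("float32")
y_train = rng.integers(0, n_classes, size=160)
X_test = rng.random((20, n_features)).astype("float32")

# A network built as a sequence of layers; the output layer has one
# neuron per class and uses softmax to produce class probabilities.
model = keras.Sequential([
    keras.Input(shape=(n_features,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(n_classes, activation="softmax"),
])

# Assumed settings: integer labels call for the sparse categorical loss.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

model.fit(X_train, y_train, epochs=2, batch_size=32, verbose=0)

# predict() returns one probability per class for every test row;
# the predicted label is the class with the highest probability.
probabilities = model.predict(X_test, verbose=0)
predicted_labels = probabilities.argmax(axis=1)
print(probabilities.shape)  # (20, 16)
```

Because the placeholder labels are random, the predictions carry no meaning here; the sketch only shows the order of the calls and the shapes involved.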
Prediction Using Convolutional Neural Networks
Despite the benefits of complex neural networks like CNNs, it is important to note that:
One of the key advantages of CNNs is that they are very good
Figure 4.14: Convolutional neural network without manual feature engineering
Transfer Learning
What are the challenges of visual data classification?
You are given two numpy arrays X_train and y_train. Each row in X_train has a shape
Describe briefly how CNNs work and one of their key advantages.
You are given two numpy arrays X_train and y_train. Each row in X_train has a shape
Name some challenges of CNNs.