Image Classification using MNIST
Author: Tianxiang (Adam) Gao
Course: CSC 383/483: Applied Deep Learning
Description: In this assignment, you will build a simple image classifier for the MNIST handwritten digit dataset using the Keras deep learning library. The MNIST dataset consists of 70,000 grayscale images of handwritten digits (0–9), each of size 28×28 pixels.
Setup
We will first import some useful libraries:
- numpy for numerical operations (e.g., arrays, random sampling).
- keras for loading the MNIST dataset and building deep learning models.
- keras.layers provides the building blocks (dense layers, convolutional layers, activation functions, etc.) to design neural networks.
- matplotlib for visualizing images and plotting graphs.
import numpy as np
import keras
from keras import layers
import matplotlib.pyplot as plt
Prepare the data [10/10]
- Use keras.datasets.mnist.load_data() to load training and testing data. Name them x_train, y_train, x_test, y_test. Print the shape of both x_train and x_test to confirm the number of samples and image dimensions.
- Convert pixel values from integers in the range 0–255 to floating-point numbers between 0 and 1 (normalize). Use np.expand_dims(data, -1) to reshape the arrays so that each image has an explicit channel dimension (since MNIST images are grayscale).
- Print the first 10 labels from y_train to see their raw integer values (0–9). Convert both y_train and y_test into one-hot encoded vectors using keras.utils.to_categorical. Print the first 10 labels again to observe the difference between integer labels and one-hot encoded labels.
num_classes = 10
# input_shape = (28, 28, 1) # 1 channel (grayscale)
(x_train, y_train), (x_test, y_test) = # load data
print("x_train shape:", x_train.shape)
x_train = # normalize
x_test = # normalize
x_train = # reshape
x_test = # reshape
print("x_train shape:", x_train.shape)
print("first 10 labels:", y_train[0:10])
y_train = # one-hot
y_test = # one-hot
print("first 10 labels:", y_train[0:10])
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
x_train shape: (60000, 28, 28)
x_train shape: (60000, 28, 28, 1)
first 10 labels: [5 0 4 1 9 2 1 3 1 4]
first 10 labels: [[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]]
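For reference, here is a minimal sketch of one way the cell above could be completed, using the standard keras.datasets.mnist.load_data, np.expand_dims, and keras.utils.to_categorical calls and reusing num_classes from the starter code; the scale-by-255 normalization is the usual convention:

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()  # load data
print("x_train shape:", x_train.shape)
# Scale pixel values from integers in [0, 255] to floats in [0, 1]
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
# Add an explicit grayscale channel: (28, 28) -> (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
print("x_train shape:", x_train.shape)
print("first 10 labels:", y_train[0:10])
# One-hot encode the integer labels
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
print("first 10 labels:", y_train[0:10])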
Visualize the data [10/10]
- Randomly pick 9 images from the training set x_train. Display them in a 3×3 grid using Matplotlib (plt.subplot). For each image, show its corresponding digit label (from y_train) as the subplot title.
indices = # random pick 9 indices
plt.figure(figsize=(6, 6))
for i, idx in enumerate(indices):
plt.subplot(3, 3, i + 1)
# squeeze last channel back to 2D for grayscale display
plt.imshow()
# show the integer label, not one-hot
plt.title()
plt.axis("off")
plt.tight_layout()
plt.show()
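One possible completion of the plotting cell is sketched below. It assumes the preprocessing above has already run, so each image carries a trailing channel dimension (removed with squeeze) and the labels are one-hot (converted back to integers with np.argmax):

indices = np.random.choice(len(x_train), size=9, replace=False)  # random pick 9 indices
plt.figure(figsize=(6, 6))
for i, idx in enumerate(indices):
    plt.subplot(3, 3, i + 1)
    # squeeze last channel back to 2D for grayscale display
    plt.imshow(x_train[idx].squeeze(), cmap="gray")
    # show the integer label, not one-hot
    plt.title(str(np.argmax(y_train[idx])))
    plt.axis("off")
plt.tight_layout()
plt.show()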
Build the model [50/50]
- Recall that each MNIST image has shape (28, 28, 1) (height, width, and 1 grayscale channel). Assign this shape to a variable input_shape.
- Use keras.Sequential to build a simple two-layer MLP with the following layers:
  - Input layer: accepts images of shape input_shape.
  - Flatten layer: converts each 2D image into a 1D vector.
  - Dense layer: fully connected layer with 128 hidden units and a "sigmoid" activation function.
  - Output layer: fully connected layer with num_classes units (one for each digit 0–9) and "softmax" activation.
- Inspect the model: call model.summary() to display the network architecture, output shapes, and number of parameters in each layer.
input_shape = # define input shape
model = keras.Sequential([
# Input
# Flatten
# Dense with "sigmoid"
# Dense with "softmax"
])
# summary of the model
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Layer (type)        ┃ Output Shape   ┃   Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ flatten (Flatten)   │ (None, 784)    │         0 │
├─────────────────────┼────────────────┼───────────┤
│ dense (Dense)       │ (None, 128)    │   100,480 │
├─────────────────────┼────────────────┼───────────┤
│ dense_1 (Dense)     │ (None, 10)     │     1,290 │
└─────────────────────┴────────────────┴───────────┘
Total params: 101,770 (397.54 KB)
Trainable params: 101,770 (397.54 KB)
Non-trainable params: 0 (0.00 B)
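A sketch of one way to fill in the model cell, which would reproduce the parameter counts in the summary above (28×28×1 inputs flattened to 784, then 784×128 + 128 = 100,480 and 128×10 + 10 = 1,290 parameters); it reuses num_classes from the data-preparation step:

input_shape = (28, 28, 1)  # height, width, 1 grayscale channel
model = keras.Sequential([
    keras.Input(shape=input_shape),                    # Input
    layers.Flatten(),                                  # Flatten 28x28x1 -> 784
    layers.Dense(128, activation="sigmoid"),           # hidden layer
    layers.Dense(num_classes, activation="softmax"),   # output layer
])
# summary of the model
model.summary()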
Train the model [20/20]
- Set training parameters: Choose a batch_size of 128 and train for 20 epochs.
  - Batch size controls how many training examples are processed before updating model weights.
  - Epochs represent how many times the model will see the entire training dataset.
- Compile the model: Use the following components:
  - Loss function: CategoricalCrossentropy (since we have multi-class classification with one-hot labels).
  - Optimizer: SGD (stochastic gradient descent) with a learning_rate of 0.02.
  - Metrics: track CategoricalAccuracy during training and validation.
- Train the model:
  - Use model.fit() with the given batch size and epochs.
  - Set validation_split=0.1 so that 10% of the training data is held out for validation at the end of each epoch.
  - Observe the training and validation loss/accuracy printed after each epoch.
batch_size =
epochs =
model.compile(
loss= # CategoricalCrossentropy()
optimizer= # SGD
metrics=[
# CategoricalAccuracy()
]
)
model.fit()
Epoch 1/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.4257 - loss: 2.0869 - val_categorical_accuracy: 0.7995 - val_loss: 1.3782
Epoch 2/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.7783 - loss: 1.2703 - val_categorical_accuracy: 0.8508 - val_loss: 0.8863
Epoch 3/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8221 - loss: 0.8914 - val_categorical_accuracy: 0.8770 - val_loss: 0.6632
Epoch 4/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8481 - loss: 0.6984 - val_categorical_accuracy: 0.8892 - val_loss: 0.5471
Epoch 5/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 6ms/step - categorical_accuracy: 0.8576 - loss: 0.6094 - val_categorical_accuracy: 0.8943 - val_loss: 0.4766
Epoch 6/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8705 - loss: 0.5411 - val_categorical_accuracy: 0.9018 - val_loss: 0.4299
Epoch 7/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8752 - loss: 0.4957 - val_categorical_accuracy: 0.9075 - val_loss: 0.3970
Epoch 8/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - categorical_accuracy: 0.8822 - loss: 0.4629 - val_categorical_accuracy: 0.9113 - val_loss: 0.3728
Epoch 9/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8856 - loss: 0.4392 - val_categorical_accuracy: 0.9130 - val_loss: 0.3546
Epoch 10/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8912 - loss: 0.4159 - val_categorical_accuracy: 0.9145 - val_loss: 0.3397
Epoch 11/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - categorical_accuracy: 0.8896 - loss: 0.4073 - val_categorical_accuracy: 0.9150 - val_loss: 0.3278
Epoch 12/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8924 - loss: 0.3951 - val_categorical_accuracy: 0.9162 - val_loss: 0.3180
Epoch 13/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8954 - loss: 0.3827 - val_categorical_accuracy: 0.9178 - val_loss: 0.3098
Epoch 14/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8957 - loss: 0.3790 - val_categorical_accuracy: 0.9205 - val_loss: 0.3027
Epoch 15/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - categorical_accuracy: 0.8977 - loss: 0.3651 - val_categorical_accuracy: 0.9207 - val_loss: 0.2963
Epoch 16/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - categorical_accuracy: 0.9003 - loss: 0.3606 - val_categorical_accuracy: 0.9222 - val_loss: 0.2910
Epoch 17/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - categorical_accuracy: 0.9000 - loss: 0.3565 - val_categorical_accuracy: 0.9223 - val_loss: 0.2861
Epoch 18/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.9018 - loss: 0.3472 - val_categorical_accuracy: 0.9238 - val_loss: 0.2819
Epoch 19/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.9018 - loss: 0.3476 - val_categorical_accuracy: 0.9242 - val_loss: 0.2778
Epoch 20/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.9047 - loss: 0.3391 - val_categorical_accuracy: 0.9255 - val_loss: 0.2742
<keras.src.callbacks.history.History at 0x7cb357eb1190>
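One way the training cell above could be filled in is sketched here, using the keras.losses, keras.optimizers, and keras.metrics classes named in the instructions; exact loss and accuracy values will vary from run to run:

batch_size = 128
epochs = 20
model.compile(
    loss=keras.losses.CategoricalCrossentropy(),
    optimizer=keras.optimizers.SGD(learning_rate=0.02),
    metrics=[
        keras.metrics.CategoricalAccuracy(),
    ],
)
# hold out 10% of the training data for validation each epoch
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)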
Evaluate the trained model on the test set [10/10]
- After training, use model.evaluate(x_test, y_test) to measure how well the model generalizes to unseen data. Store the result in a variable score. Print both the test loss (score[0]) and the test accuracy (score[1]).
score = # model evaluate
print("Test loss:", score[0])
print("Test accuracy:", score[1])
Test loss: 0.3149966299533844
Test accuracy: 0.9133999943733215
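A matching sketch for the evaluation cell; model.evaluate returns [loss, categorical_accuracy] here because a single metric was passed to compile:

score = model.evaluate(x_test, y_test, verbose=0)  # model evaluate
print("Test loss:", score[0])
print("Test accuracy:", score[1])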