Image Classification using MNIST

Author: Tianxiang (Adam) Gao
Course: CSC 383/483: Applied Deep Learning
Description: In this assignment, you will build a simple image classifier for the MNIST handwritten digit dataset using the Keras deep learning library. The MNIST dataset consists of 70,000 grayscale images of handwritten digits (0–9), each of size 28×28 pixels.

Setup

We will first import some useful libraries:

  • numpy for numerical operations (e.g., arrays, random sampling).
  • keras for loading the MNIST dataset and building deep learning models.
  • keras.layers provides the building blocks (dense layers, convolutional layers, activation functions, etc.) to design neural networks.
  • matplotlib for visualizing images and plotting graphs.
In [1]:
import numpy as np
import keras
from keras import layers
import matplotlib.pyplot as plt

Prepare the data [10/10]

  1. Use keras.datasets.mnist.load_data() to load training and testing data. Name them x_train, y_train, x_test, y_test. Print the shape of both x_train and x_test to confirm the number of samples and image dimensions.

  2. Convert pixel values from integers in the range 0–255 to floating-point numbers between 0 and 1 (normalize). Use np.expand_dims(data, -1) to reshape the arrays so that each image has an explicit channel dimension (since MNIST images are grayscale).

  3. Print the first 10 labels from y_train to see their raw integer values (0–9). Convert both y_train and y_test into one-hot encoded vectors using keras.utils.to_categorical. Print the first 10 labels again to observe the difference between integer labels and one-hot encoded labels.

In [2]:
num_classes = 10
# input_shape = (28, 28, 1) # 1 channel (grayscale)

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()  # load data
print("x_train shape:", x_train.shape)

x_train = x_train.astype("float32") / 255  # normalize pixels to [0, 1]
x_test = x_test.astype("float32") / 255    # normalize pixels to [0, 1]

x_train = np.expand_dims(x_train, -1)  # add channel dim: (28, 28) -> (28, 28, 1)
x_test = np.expand_dims(x_test, -1)    # add channel dim
print("x_train shape:", x_train.shape)

print("first 10 labels:", y_train[0:10])
y_train = keras.utils.to_categorical(y_train, num_classes)  # one-hot encode
y_test = keras.utils.to_categorical(y_test, num_classes)    # one-hot encode
print("first 10 labels:", y_train[0:10])
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
x_train shape: (60000, 28, 28)
x_train shape: (60000, 28, 28, 1)
first 10 labels: [5 0 4 1 9 2 1 3 1 4]
first 10 labels: [[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]]
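
As a quick sanity check (not required by the assignment), np.argmax along the class axis undoes the one-hot encoding and recovers the original integer labels:

# optional check: argmax over the class axis recovers the integer labels
print("recovered labels:", np.argmax(y_train[0:10], axis=1))  # expected: [5 0 4 1 9 2 1 3 1 4]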

Visualize the data [10/10]

  1. Randomly pick 9 images from the training set x_train. Display them in a 3×3 grid using Matplotlib (plt.subplot). For each image, show its corresponding digit label (from y_train) as the subplot title.
In [3]:
indices = np.random.choice(len(x_train), size=9, replace=False)  # randomly pick 9 indices

plt.figure(figsize=(6, 6))
for i, idx in enumerate(indices):
    plt.subplot(3, 3, i + 1)
    # squeeze the channel dimension back to 2D for grayscale display
    plt.imshow(x_train[idx].squeeze(), cmap="gray")
    # show the integer label, not the one-hot vector
    plt.title(int(np.argmax(y_train[idx])))
    plt.axis("off")

plt.tight_layout()
plt.show()
[Figure: 3×3 grid of randomly selected training digits, each titled with its integer label.]

Build the model [50/50]

  1. Recall that each MNIST image has shape (28, 28, 1) (height, width, and 1 grayscale channel). Assign this shape to a variable input_shape.

  2. Use keras.Sequential to build a simple two-layer MLP with the following layers:

    • Input layer: accepts images of shape input_shape.
    • Flatten layer: converts each 2D image into a 1D vector.
    • Dense layer: fully connected layer with 128 hidden units and a "sigmoid" activation function.
    • Output layer: fully connected layer with num_classes units (one for each digit 0–9) and "softmax" activation.
  3. Inspect the model: Call model.summary() to display the network architecture, output shapes, and number of parameters in each layer.

In [4]:
input_shape = (28, 28, 1)  # height, width, 1 grayscale channel

model = keras.Sequential([
    keras.Input(shape=input_shape),                   # input layer
    layers.Flatten(),                                 # 28x28x1 -> 784-dim vector
    layers.Dense(128, activation="sigmoid"),          # hidden layer
    layers.Dense(num_classes, activation="softmax"),  # output layer
])

# summary of the model: layers, output shapes, and parameter counts
model.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ flatten (Flatten)               │ (None, 784)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 128)            │       100,480 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 10)             │         1,290 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 101,770 (397.54 KB)
 Trainable params: 101,770 (397.54 KB)
 Non-trainable params: 0 (0.00 B)
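
The parameter counts above can be checked by hand: a Dense layer with n_in inputs and n_out units has n_in × n_out weights plus n_out biases. A quick sketch of that arithmetic:

# hand-check of model.summary(): weights + biases per Dense layer
hidden_params = 784 * 128 + 128   # 100,480 for the 128-unit hidden layer
output_params = 128 * 10 + 10     # 1,290 for the 10-unit output layer
print(hidden_params + output_params)  # 101,770 total, matching the summary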

Train the model [20/20]

  1. Set training parameters: Choose a batch_size of 128 and train for 20 epochs.

    • Batch size controls how many training examples are processed before updating model weights.
    • Epochs represent how many times the model will see the entire training dataset.
  2. Compile the model: Use the following components:

    • Loss function: CategoricalCrossentropy (since we have multi-class classification with one-hot labels).
    • Optimizer: SGD (stochastic gradient descent) with a learning_rate of 0.02.
    • Metrics: track CategoricalAccuracy during training and validation.
  3. Train the model

    • Use model.fit() with the given batch size and epochs.
    • Set validation_split=0.1 so that 10% of the training data is held out for validation at the end of each epoch.
    • Observe the training and validation loss/accuracy printed after each epoch.
In [5]:
batch_size = 128
epochs = 20

model.compile(
    loss=keras.losses.CategoricalCrossentropy(),
    optimizer=keras.optimizers.SGD(learning_rate=0.02),
    metrics=[
        keras.metrics.CategoricalAccuracy(),
    ],
)

model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)
Epoch 1/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.4257 - loss: 2.0869 - val_categorical_accuracy: 0.7995 - val_loss: 1.3782
Epoch 2/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.7783 - loss: 1.2703 - val_categorical_accuracy: 0.8508 - val_loss: 0.8863
Epoch 3/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8221 - loss: 0.8914 - val_categorical_accuracy: 0.8770 - val_loss: 0.6632
Epoch 4/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8481 - loss: 0.6984 - val_categorical_accuracy: 0.8892 - val_loss: 0.5471
Epoch 5/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 6ms/step - categorical_accuracy: 0.8576 - loss: 0.6094 - val_categorical_accuracy: 0.8943 - val_loss: 0.4766
Epoch 6/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8705 - loss: 0.5411 - val_categorical_accuracy: 0.9018 - val_loss: 0.4299
Epoch 7/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8752 - loss: 0.4957 - val_categorical_accuracy: 0.9075 - val_loss: 0.3970
Epoch 8/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - categorical_accuracy: 0.8822 - loss: 0.4629 - val_categorical_accuracy: 0.9113 - val_loss: 0.3728
Epoch 9/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8856 - loss: 0.4392 - val_categorical_accuracy: 0.9130 - val_loss: 0.3546
Epoch 10/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8912 - loss: 0.4159 - val_categorical_accuracy: 0.9145 - val_loss: 0.3397
Epoch 11/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - categorical_accuracy: 0.8896 - loss: 0.4073 - val_categorical_accuracy: 0.9150 - val_loss: 0.3278
Epoch 12/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8924 - loss: 0.3951 - val_categorical_accuracy: 0.9162 - val_loss: 0.3180
Epoch 13/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8954 - loss: 0.3827 - val_categorical_accuracy: 0.9178 - val_loss: 0.3098
Epoch 14/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.8957 - loss: 0.3790 - val_categorical_accuracy: 0.9205 - val_loss: 0.3027
Epoch 15/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - categorical_accuracy: 0.8977 - loss: 0.3651 - val_categorical_accuracy: 0.9207 - val_loss: 0.2963
Epoch 16/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - categorical_accuracy: 0.9003 - loss: 0.3606 - val_categorical_accuracy: 0.9222 - val_loss: 0.2910
Epoch 17/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - categorical_accuracy: 0.9000 - loss: 0.3565 - val_categorical_accuracy: 0.9223 - val_loss: 0.2861
Epoch 18/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.9018 - loss: 0.3472 - val_categorical_accuracy: 0.9238 - val_loss: 0.2819
Epoch 19/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.9018 - loss: 0.3476 - val_categorical_accuracy: 0.9242 - val_loss: 0.2778
Epoch 20/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - categorical_accuracy: 0.9047 - loss: 0.3391 - val_categorical_accuracy: 0.9255 - val_loss: 0.2742
Out[5]:
<keras.src.callbacks.history.History at 0x7cb357eb1190>
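
The 422 batches reported per epoch follow from the settings above: with validation_split=0.1, only 90% of the 60,000 training images are used for gradient updates, and 54,000 images in batches of 128 give ceil(54000 / 128) = 422 steps. A small sketch of that calculation:

import math
steps_per_epoch = math.ceil(60000 * (1 - 0.1) / 128)  # 54,000 training images / batch size 128
print(steps_per_epoch)  # 422, matching the progress bars above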

Evaluate the trained model on the test set [10/10]

  1. After training, use model.evaluate(x_test, y_test) to measure how well the model generalizes to unseen data. Store the result in a variable score. Print both the test loss (score[0]) and the test accuracy (score[1]).
In [6]:
score = model.evaluate(x_test, y_test)  # evaluate on the held-out test set
print("Test loss:", score[0])
print("Test accuracy:", score[1])
Test loss: 0.3149966299533844
Test accuracy: 0.9133999943733215
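
Beyond the aggregate score, it can be instructive (though not required here) to inspect individual predictions: model.predict returns one softmax probability vector per image, and np.argmax picks the most likely digit.

# illustrative only: compare predicted and true digits for the first 10 test images
probs = model.predict(x_test[:10])                   # shape (10, 10): class probabilities per image
print("predicted:", np.argmax(probs, axis=1))
print("true:     ", np.argmax(y_test[:10], axis=1))  # y_test is one-hot after preprocessing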