Code 5: Debugging a 1-Hidden-Layer Neural Network Model

Documenting my debugging of a neural network model I built from scratch
coding
debugging
Author

Tony Phung

Published

April 9, 2024

1. Introduction

I’ve coded a neural network from scratch using the Kaggle Titanic dataset, based on Jeremy Howard’s popular NN model.

I noticed discrepancies in the loss between my model and the reference model, so I will attempt to debug My Model (TP) without looking at the Reference Model’s (RM) code.

2. The Problem

Loss differences (from 2nd epoch onwards):

  • TP-Loss: 0.544 (epoch_1), 0.538 (epoch_2)
  • RM-Loss: 0.543 (epoch_1), 0.532 (epoch_2)

The difference grows with each epoch: roughly 0.001 at epoch 1 and 0.006 at epoch 2.

3. The Approach

This neural network model has only one hidden layer.
I’ve decided to test for differences at 4 stages:

  1. Input level (input data, coefficients, and constants)
  2. Intermediary Calculations (hidden layers and relu)
  3. Predictions (predictions and sigmoid)
  4. Update Coefficients (gradients and updated coefficients)
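
A stage-by-stage comparison like this can be sketched with a small helper. The function name `compare_stage`, the tolerance and the toy arrays below are hypothetical, and NumPy stands in for whatever tensor library the two models use. The helper checks shapes before values, since a value check alone may broadcast and report a match.

```python
import numpy as np

def compare_stage(name, tp_value, rm_value, atol=1e-6):
    """Report whether two arrays agree in shape and value (hypothetical helper)."""
    tp_value = np.asarray(tp_value)
    rm_value = np.asarray(rm_value)
    if tp_value.shape != rm_value.shape:
        # Shapes are checked first: np.allclose broadcasts its inputs,
        # so (4, 1) vs (4,) could otherwise pass as "matching".
        print(f"{name}: NOT OKAY (shapes {tp_value.shape} vs {rm_value.shape})")
        return False
    if not np.allclose(tp_value, rm_value, atol=atol):
        print(f"{name}: NOT OKAY (values differ)")
        return False
    print(f"{name}: OKAY (data-matching)")
    return True

# Toy values standing in for the real TP and RM tensors.
same = compare_stage("L1_nxq", np.ones((3, 2)), np.ones((3, 2)))
diff = compare_stage("dep_mx1", np.ones((4, 1)), np.ones(4))
```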

4. The Analysis

4.1 Input Level - Normalised Input Data - idep_mxn

EPOCH 1 and 2: OKAY (data-matching)

4.2 Input Level - Coeffs - Layer 1 - L1_nxq

EPOCH 1: OKAY (data-matching)

4.3 Input Level - Coeffs - Layer 2 - L2_qx1

EPOCH 1: OKAY (data-matching)

4.4 Input Level - Coeffs - Constant - CONST_1

EPOCH 1: OKAY (data-matching)

4.5 Intermediary Calcs - idep@L1 - pred_PSET_HL_mxq

EPOCH 1: OKAY (data-matching)

4.6 Intermediary Calcs - relu(idep@L1) - PSET_HL_mxq

EPOCH 1: OKAY (data-matching)

4.7 Final Preds - PSET_HL_mxq@L2 - PREDS_mx1

EPOCH 1: OKAY (data-matching)

4.8 Final Preds - PREDS_mx1 + CONST_1 - PREDS_C_mx1

EPOCH 1: OKAY (data-matching)

4.9 Final Preds - Sigmoid(PREDS_C_mx1) - SGM_PREDS_C_mx1

EPOCH 1: OKAY (data-matching)
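
For reference, the forward pass checked in sections 4.5 to 4.9 can be sketched as follows. This is a minimal illustration, not the code of either model: it uses NumPy with random toy data, the sizes `m`, `n`, `q` are made up, and `CONST_1` is assumed to be a scalar. Only the variable names are taken from the section headings.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, q = 5, 3, 4  # toy sizes: rows, input columns, hidden units (assumed)

idep_mxn = rng.random((m, n))       # normalised input data
L1_nxq = rng.random((n, q)) - 0.5   # layer 1 coefficients
L2_qx1 = rng.random((q, 1)) - 0.5   # layer 2 coefficients
CONST_1 = rng.random()              # constant term (assumed scalar)

pred_PSET_HL_mxq = idep_mxn @ L1_nxq                    # 4.5 idep@L1
PSET_HL_mxq = np.maximum(pred_PSET_HL_mxq, 0.0)         # 4.6 relu
PREDS_mx1 = PSET_HL_mxq @ L2_qx1                        # 4.7 hidden layer @ L2
PREDS_C_mx1 = PREDS_mx1 + CONST_1                       # 4.8 add the constant
SGM_PREDS_C_mx1 = 1.0 / (1.0 + np.exp(-PREDS_C_mx1))    # 4.9 sigmoid

print(SGM_PREDS_C_mx1.shape)  # (5, 1)
```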

4.10 Loss

EPOCH 1: NOT OKAY, loss values differ from the 4th decimal place

Since the loss is calculated by taking the absolute difference (then the mean) between the:

  • predictions and
  • (actual) dependent variables

let’s validate the following across the two neural network models:

  • Dependent Variable (“Survived”)
  • Predictions

Model   Loss
TP      0.5433918237686157
RM      0.5439100861549377
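
The loss described above (absolute difference, then mean) can be sketched like this. The function name `mean_abs_loss` and the three toy rows are hypothetical, and NumPy is used for illustration only.

```python
import numpy as np

def mean_abs_loss(preds, dep):
    """Mean of the absolute difference between predictions and targets."""
    return np.abs(preds - dep).mean()

# Toy predictions and targets, both with shape (3, 1).
preds = np.array([[0.9], [0.2], [0.6]])
dep = np.array([[1.0], [0.0], [1.0]])

loss = mean_abs_loss(preds, dep)  # (0.1 + 0.2 + 0.4) / 3
print(loss)
```

Note that the result only means what it should when `preds` and `dep` have the same shape.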

5. The Bug

5.1 Input Level - Dep Variable - dep_mx1

EPOCH 1: NOT OKAY: Dimensions are different!

Found the Bug!

  • TP-dimensions: [713,1]
  • RM-dimensions: [713]
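
One way a shape difference like this can change a loss without raising an error is broadcasting. The sketch below is an assumption about the mechanism, not a trace of either model: it uses NumPy and three toy rows instead of 713, but the broadcasting rule for a column minus a flat vector is the same in PyTorch.

```python
import numpy as np

preds_mx1 = np.array([[0.9], [0.2], [0.6]])  # shape (3, 1), like [713,1]
dep_m = np.array([1.0, 0.0, 1.0])            # shape (3,), like [713]

# (3, 1) minus (3,) broadcasts to (3, 3): every prediction is compared
# with every target, not just the one in its own row.
diff = preds_mx1 - dep_m
print(diff.shape)  # (3, 3)

loss_broadcast = np.abs(diff).mean()
loss_elementwise = np.abs(preds_mx1[:, 0] - dep_m).mean()
print(loss_broadcast, loss_elementwise)
```

The two losses differ even though no exception is raised, which is why a mismatch of this kind is easy to miss.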

6. The Fix

6.1 Adding Trailing Dimension [:,None]

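Solution: Add a trailing dimension to the dependent variable with [:,None] so that its shape is [713,1], matching the predictions. This fixes the loss calculation, and with it the gradients and the predictions in later epochs.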
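
A minimal sketch of the [:,None] indexing, using NumPy and toy values rather than the real data. The shape assertion is an added suggestion, not part of the original fix.

```python
import numpy as np

dep_m = np.array([1.0, 0.0, 1.0])            # shape (3,)
dep_mx1 = dep_m[:, None]                     # shape (3, 1): trailing dimension added
preds_mx1 = np.array([[0.9], [0.2], [0.6]])  # shape (3, 1)

# Guard against silent broadcasting before computing the loss.
assert preds_mx1.shape == dep_mx1.shape

loss = np.abs(preds_mx1 - dep_mx1).mean()
print(dep_mx1.shape, loss)
```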

6.2 Check New Loss

It matches EXACTLY!

7. Conclusion

It goes to show how important it is to get the dimensions right: a mismatch can change results subtly and materially at the same time.