Author

Tony Phung

Published

April 9, 2024

1. Introduction

I’ve coded a neural network from scratch on the Kaggle Titanic dataset, based on Jeremy Howard’s popular NN model.

I noticed discrepancies in the loss between my model and the reference model, so I will attempt to debug my model (TP) without looking at the reference model’s (RM) code.

2. The Problem

Loss differences (from 2nd epoch onwards):

TP-Loss: 0.544 (epoch_1), 0.538 (epoch_2)
RM-Loss: 0.543 (epoch_1), 0.532 (epoch_2)

The difference grows per epoch.

3. The Approach

This neural network model has only one hidden layer.
I’ve decided to test for differences at 4 stages:

Input level (input data, coefficients, and constants)
Intermediary Calculations (hidden layers and relu)
Predictions (predictions and sigmoid)
Update Coefficients (gradients and updated coefficients)
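To make the first three stages concrete, here is a minimal sketch of a one-hidden-layer forward pass using the variable names that appear in the analysis. It uses NumPy rather than the original notebook's tensor library, and the sizes and random values are purely illustrative assumptions, not the real Titanic data or coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, q = 5, 3, 4  # rows, input features, hidden units (illustrative sizes only)

# Input level (random stand-ins for the real data and coefficients)
idep_mxn = rng.random((m, n))        # normalised input data
L1_nxq = rng.random((n, q)) - 0.5    # layer-1 coefficients
L2_qx1 = rng.random((q, 1)) - 0.5    # layer-2 coefficients
CONST_1 = rng.random()               # constant term

# Intermediary calculations
pred_PSET_HL_mxq = idep_mxn @ L1_nxq             # idep@L1
PSET_HL_mxq = np.maximum(pred_PSET_HL_mxq, 0)    # relu(idep@L1)

# Predictions
PREDS_mx1 = PSET_HL_mxq @ L2_qx1                 # hidden layer @ L2
PREDS_C_mx1 = PREDS_mx1 + CONST_1                # add the constant
SGM_PREDS_C_mx1 = 1 / (1 + np.exp(-PREDS_C_mx1)) # sigmoid

print(SGM_PREDS_C_mx1.shape)  # (5, 1)
```

Each named array is one checkpoint, so a mismatch between the two models can be pinned to the first array that differs.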

4. The Analysis

4.1 Input Level - Normalised Input Data - `idep_mxn`

EPOCH 1 and 2: OKAY (data-matching)
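Each check in this section compares the same array taken from both models. A rough sketch of such a check is below; `compare_stage` is a hypothetical helper (not from the original notebook), written with NumPy, and it reports shape and value agreement separately because the two can fail independently.

```python
import numpy as np

def compare_stage(name, tp, rm, atol=1e-6):
    """Hypothetical helper: compare one stage's array from TP and RM."""
    tp, rm = np.asarray(tp), np.asarray(rm)
    same_shape = tp.shape == rm.shape
    # Only compare values when shapes agree; np.allclose would otherwise
    # broadcast the two arrays against each other.
    same_values = same_shape and bool(np.allclose(tp, rm, atol=atol))
    status = "OKAY" if same_values else "NOT OKAY"
    print(f"{name}: {status} (shapes {tp.shape} vs {rm.shape})")
    return same_shape, same_values

# Illustrative usage with made-up data
a = np.arange(6, dtype=float).reshape(3, 2)
result_match = compare_stage("idep_mxn", a, a.copy())
result_shape = compare_stage("dep_mx1", np.zeros((3, 1)), np.zeros(3))
```

Checking shapes explicitly matters here: two arrays can hold identical numbers and still behave differently downstream if one has an extra dimension.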

4.2 Input Level - Coeffs - Layer 1 - `L1_nxq`

EPOCH 1: OKAY (data-matching)

4.3 Input Level - Coeffs - Layer 2 - `L2_qx1`

EPOCH 1: OKAY (data-matching)

4.4 Input Level - Coeffs - Constant - `CONST_1`

EPOCH 1: OKAY (data-matching)

4.5 Intermediary Calcs - idep@L1 - `pred_PSET_HL_mxq`

EPOCH 1: OKAY (data-matching)

4.6 Intermediary Calcs - relu(idep@L1) - `PSET_HL_mxq`

EPOCH 1: OKAY (data-matching)

4.7 Final Preds - PSET_HL_mxq@L2 - `PREDS_mx1`

EPOCH 1: OKAY (data-matching)

4.8 Final Preds - PREDS_mx1 + CONST_1 - `PREDS_C_mx1`

EPOCH 1: OKAY (data-matching)

4.9 Final Preds - Sigmoid(PREDS_C_mx1) - `SGM_PREDS_C_mx1`

EPOCH 1: OKAY (data-matching)

4.10 Loss

EPOCH 1: NOT OKAY, loss values differ from the 4th decimal place

Since the loss is calculated as the mean of the absolute difference between the:

predictions and
(actual) dependent variables

Let’s validate both across the neural network models:

Dependent Variable (“Survived”)
Predictions

Model	Loss
TP	0.5433918237686157
RM	0.5439100861549377
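The loss described above (mean of the absolute difference between predictions and the dependent variable) can be sketched as follows. The function name `mae_loss` and the three-row example are assumptions for illustration; the arrays are not the real model outputs.

```python
import numpy as np

def mae_loss(preds, dep):
    """Mean absolute difference between predictions and actual values."""
    return np.abs(preds - dep).mean()

# Made-up predictions and labels, both with shape (3, 1)
preds = np.array([[0.9], [0.2], [0.6]])
dep = np.array([[1.0], [0.0], [1.0]])

loss = mae_loss(preds, dep)  # (0.1 + 0.2 + 0.4) / 3
print(loss)
```

Note that this only gives the intended row-by-row result when `preds` and `dep` have the same shape.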

5. The Bug

5.1 Input Level - Dep Variable - `dep_mx1`

EPOCH 1: NOT OKAY: dimensions are different!

Found the Bug!

TP-dimensions: [713,1]
RM-dimensions: [713]
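Why a shape difference alone changes the loss can be seen with broadcasting. The sketch below uses NumPy (whose broadcasting rules PyTorch also follows) with zero-filled arrays for the shapes and a made-up two-row example for the values; none of it is the real data.

```python
import numpy as np

# Shapes as reported above; zeros stand in for the real values
preds = np.zeros((713, 1))
dep_column = np.zeros((713, 1))
dep_flat = np.zeros(713)

print((preds - dep_column).shape)  # (713, 1): one difference per row
print((preds - dep_flat).shape)    # (713, 713): every prediction vs every label

# Tiny made-up example showing the effect on the loss
small_preds = np.array([[0.9], [0.2]])
small_dep = np.array([1.0, 0.0])

loss_broadcast = np.abs(small_preds - small_dep).mean()         # approx. 0.5
loss_rowwise = np.abs(small_preds[:, 0] - small_dep).mean()     # approx. 0.15
print(loss_broadcast, loss_rowwise)
```

No error is raised in the broadcast case, which is why the mismatch only shows up as a slightly different loss.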

6. The Fix

6.1 Adding Trailing Dimension `[:,None]`

Solution: Add a trailing dimension to the dependent variable so that its shape matches the predictions; this fixes the loss calculation.
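A minimal sketch of the `[:,None]` fix, assuming it is applied to whichever dependent-variable array is missing the trailing dimension. It uses NumPy with random stand-in data; the same indexing syntax also works on PyTorch tensors.

```python
import numpy as np

rng = np.random.default_rng(0)
dep = rng.integers(0, 2, size=713).astype(float)  # stand-in labels, shape (713,)
preds_mx1 = rng.random((713, 1))                  # stand-in predictions, shape (713, 1)

dep_mx1 = dep[:, None]  # add a trailing dimension: (713,) -> (713, 1)
print(dep_mx1.shape)    # (713, 1)

# Shapes now match, so the difference is taken row by row
diff = preds_mx1 - dep_mx1
loss = np.abs(diff).mean()
print(diff.shape)       # (713, 1)
```

`dep[:, None]` is equivalent to `dep.reshape(-1, 1)` here; either way the values are unchanged and only the shape differs.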

6.2 Check New Loss

It matches EXACTLY!

7. Conclusion

It goes to show how getting the dimensions wrong can change the results both subtly and materially at the same time.