Less confusing Confusion Matrices – Disruptive Engineering

An intro to Confusion Matrices

This post explains Confusion Matrices for Classification to a non-technical audience.

So what are Confusion Matrices?

Confusion Matrices are tables that summarise how well a classification Algorithm is performing by comparing its predictions with the correct answers. They are built using a part of the data that the algorithm has not seen before: the test set.

Let’s start by considering an algorithm that needs to differentiate between Cats and Dogs. In the test set, there are two types of images the classifier sees:

Images of Dogs – the algorithm will predict correctly that some of them are dogs (in green) and in some cases it will make mistakes and say they are cats (in red)

Images of Cats – the algorithm will make mistakes and say some of them are dogs (in red) and correctly predict that some of them are cats (in green)

A larger table can be built from the above:

And this is a confusion matrix.
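
To make the table concrete, here is a minimal Python sketch that counts the four outcomes for a small test set. The labels and predictions are invented for illustration, and the helper name `confusion_counts` is hypothetical.

```python
from collections import Counter

# Hypothetical test set: the true label of each image and what the classifier said.
actual    = ["dog", "dog", "dog", "dog", "cat", "cat", "cat", "cat", "cat", "dog"]
predicted = ["dog", "dog", "cat", "dog", "cat", "dog", "cat", "cat", "cat", "dog"]

def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) pair occurs."""
    return Counter(zip(actual, predicted))

counts = confusion_counts(actual, predicted)

labels = ["dog", "cat"]
# Rows are the actual class, columns are the predicted class.
print(f"{'':12}" + "".join(f"{'pred ' + p:>10}" for p in labels))
for a in labels:
    print(f"{'actual ' + a:12}" + "".join(f"{counts[(a, p)]:>10}" for p in labels))
```

The diagonal cells (dog predicted as dog, cat predicted as cat) are the "green" correct predictions; the off-diagonal cells are the "red" mistakes.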

Where does it help?

This table helps us understand how well the algorithm is performing. For example, Accuracy is calculated from it as the number of correct predictions (the green cells) divided by the total number of predictions:
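
A minimal worked example in Python, using counts invented for illustration (the variable names are hypothetical):

```python
# Hypothetical counts from a cats-vs-dogs test set (invented for illustration).
dogs_predicted_as_dogs = 40  # correct (green)
dogs_predicted_as_cats = 10  # mistake (red)
cats_predicted_as_dogs = 5   # mistake (red)
cats_predicted_as_cats = 45  # correct (green)

# Accuracy = correct predictions / all predictions.
correct = dogs_predicted_as_dogs + cats_predicted_as_cats
total = correct + dogs_predicted_as_cats + cats_predicted_as_dogs

accuracy = correct / total
print(f"Accuracy: {accuracy:.0%}")  # 85 correct out of 100 images
```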

You will also come across Confusion Matrices that are presented as follows:

More generally, expect a confusion matrix for a binary classification task to look like this:
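
The four cells of a binary confusion matrix are usually called true positives (TP), false negatives (FN), false positives (FP) and true negatives (TN). The sketch below prints one common layout, with rows for the actual class and columns for the predicted class. It assumes "dog" is the positive class, which is an arbitrary choice; the function name `binary_confusion_matrix` and the data are hypothetical.

```python
def binary_confusion_matrix(actual, predicted, positive):
    """Return (tp, fn, fp, tn), treating `positive` as the positive class."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # positive correctly predicted as positive
        elif a == positive:
            fn += 1  # positive wrongly predicted as negative
        elif p == positive:
            fp += 1  # negative wrongly predicted as positive
        else:
            tn += 1  # negative correctly predicted as negative
    return tp, fn, fp, tn

# Invented example data.
actual    = ["dog", "dog", "dog", "cat", "cat", "cat"]
predicted = ["dog", "dog", "cat", "dog", "cat", "cat"]
tp, fn, fp, tn = binary_confusion_matrix(actual, predicted, positive="dog")

print("                  Predicted positive  Predicted negative")
print(f"Actual positive   TP = {tp:<14}FN = {fn}")
print(f"Actual negative   FP = {fp:<14}TN = {tn}")
```

Some tools and articles swap the rows and columns, so it is worth checking the axis labels before reading the numbers.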

Hope you are not still confused!
