An intro to Confusion Matrices
This post explains Confusion Matrices for Classification to non-technical readers.
So what are Confusion Matrices?
Confusion Matrices are tables that show how well a classification algorithm is performing by comparing its predictions with the correct answers. They are built using a part of the data that the algorithm has not seen before: the test set.
Let’s start by considering an algorithm that needs to differentiate between Cats and Dogs. In the test set, there are two types of images the classifier sees:
- Images of Dogs: the algorithm will correctly predict that some of them are dogs (in green), and it will mistakenly say that others are cats (in red).
- Images of Cats: the algorithm will mistakenly say that some of them are dogs (in red), and it will correctly predict that others are cats (in green).
A larger table can be built from the above:
And this is a confusion matrix.
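For readers who want to see the table built by hand, here is a minimal sketch in Python. The ten images, their labels, and the function name `confusion_matrix` are made up for illustration. Rows are the actual animals and columns are the algorithm's predictions, which is one common layout but not the only one.

```python
from collections import Counter

# Hypothetical test set: the true animal in each image and the algorithm's guess.
actual    = ["dog", "dog", "dog", "dog", "cat", "cat", "cat", "cat", "cat", "dog"]
predicted = ["dog", "dog", "cat", "dog", "cat", "dog", "cat", "cat", "dog", "dog"]


def confusion_matrix(actual, predicted, labels):
    """Count how often each actual label was predicted as each label.

    Rows are actual labels and columns are predicted labels, both in the
    order given by `labels`.
    """
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


labels = ["dog", "cat"]
matrix = confusion_matrix(actual, predicted, labels)

# Print the table with a header row of predicted labels.
print(f"{'':<12}" + "".join(f"{'pred ' + p:<12}" for p in labels))
for label, row in zip(labels, matrix):
    print(f"{'actual ' + label:<12}" + "".join(f"{n:<12}" for n in row))
```

In this made-up example, the correct predictions (the green cells) sit on the diagonal of the table and the mistakes (the red cells) sit off it.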
Where does it help?
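Together, the four counts in the table help us understand how well the algorithm is performing. For example, Accuracy is the share of all predictions that were correct, and it is calculated from the table using the following formula:
Accuracy = (dogs correctly predicted as dogs + cats correctly predicted as cats) / (total number of images in the test set)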
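As a rough sketch of that calculation, accuracy is the number of correct predictions divided by the total number of predictions. The function name `accuracy` and the counts below are illustrative assumptions, with actual labels as rows and predicted labels as columns.

```python
def accuracy(matrix):
    """Share of all predictions that were correct.

    `matrix` is a square list of lists with actual labels as rows and
    predicted labels as columns, so correct predictions lie on the diagonal.
    """
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("the confusion matrix contains no predictions")
    return correct / total


# Hypothetical counts: rows are actual dogs and cats,
# columns are predicted dogs and cats.
example = [[4, 1],
           [2, 3]]
print(accuracy(example))  # 7 correct predictions out of 10
```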
You will also come across Confusion Matrices that are presented as follows:
More generally, expect a confusion matrix for a binary classification task to look like this:
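A common convention, assumed here, picks one class as the "positive" class and names the four cells true positives, false negatives, false positives, and true negatives. The sketch below counts them for a small made-up test set. The names `BinaryConfusion` and `binary_confusion` are hypothetical, and treating "dog" as the positive class is an arbitrary choice.

```python
from typing import NamedTuple


class BinaryConfusion(NamedTuple):
    true_positive: int   # actual positive, predicted positive
    false_negative: int  # actual positive, predicted negative
    false_positive: int  # actual negative, predicted positive
    true_negative: int   # actual negative, predicted negative


def binary_confusion(actual, predicted, positive):
    """Count the four outcomes, treating `positive` as the class of interest."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return BinaryConfusion(tp, fn, fp, tn)


# Hypothetical labels for six test images.
actual    = ["dog", "dog", "cat", "cat", "dog", "cat"]
predicted = ["dog", "cat", "cat", "dog", "dog", "cat"]
result = binary_confusion(actual, predicted, positive="dog")
print(result)
```

Swapping the positive class to "cat" would swap the true positives with the true negatives and the false positives with the false negatives. The underlying table stays the same.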
Hope you are not still confused!